Ensuring the fairness of algorithms that predict patient disease risk


Model performance across evaluation metrics, stratified by demographic group, evaluated on the test set. Left panel shows AUROC and absolute calibration error. Right panel shows false negative rate, false positive rate and threshold calibration error at two treatment thresholds (7.5% and 20%). EO, equalized odds; PCE, original pooled cohort equations; rPCE, revised PCE; rUC, recalibrated model; UC, unconstrained model. Credit: BMJ Health & Care Informatics (2022). DOI: 10.1136/bmjhci-2021-100460
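The kind of stratified evaluation the caption describes, computing AUROC and an absolute calibration error separately for each demographic group, can be sketched as follows. This is an illustrative sketch on synthetic data, not the paper's implementation; the equal-width binning scheme and variable names are assumptions.

```python
# Sketch of per-group AUROC and absolute calibration error on synthetic data.
import numpy as np
from sklearn.metrics import roc_auc_score

def absolute_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted mean of |observed event rate - mean predicted risk| over bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(y_prob, bins[1:-1])
    err, total = 0.0, len(y_true)
    for b in range(n_bins):
        mask = idx == b
        if mask.sum() == 0:
            continue
        err += mask.sum() / total * abs(y_true[mask].mean() - y_prob[mask].mean())
    return err

rng = np.random.default_rng(0)
group_names = ["Black women", "white women", "Black men", "white men"]
g = rng.integers(0, 4, size=2000)                 # subgroup labels
risk = rng.uniform(0.01, 0.4, size=2000)          # model's 10-year risk estimates
y = (rng.uniform(size=2000) < risk).astype(int)   # simulated outcomes

for gi, name in enumerate(group_names):
    m = g == gi
    print(f"{name}: AUROC={roc_auc_score(y[m], risk[m]):.3f}, "
          f"ACE={absolute_calibration_error(y[m], risk[m]):.3f}")
```

Reporting these metrics per subgroup, rather than only in aggregate, is what makes gaps like those in the figure visible in the first place.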

“To treat or not to treat?” is a question that clinicians face all the time. Some look to disease risk prediction models to aid decision making. Based on demographic factors and medical data, these models estimate which patients are more or less likely to develop a disease, and therefore likely to benefit from treatment.

As these tools proliferate in medicine, particularly in clinical guidance, researchers at Stanford University and elsewhere are working on ways to ensure the fairness of the algorithms underlying the models. Bias is emerging as a significant problem when models are not developed using data that reflect diverse populations.

In a new study, researchers at Stanford University examined key clinical guidelines for cardiovascular health that recommend using risk calculators to guide prescribing decisions for Black women, white women, Black men, and white men. The researchers examined two methods that have been proposed to improve the fairness of calculator algorithms. One approach, known as group recalibration, adjusts the risk model separately for each patient subgroup so that predicted risks better match the observed frequency of the outcome. The second approach, called equalized odds, constrains the model so that error rates are the same across groups. The researchers found that the recalibration approach was, overall, in better agreement with the guideline recommendations.

This finding highlights the importance of building algorithms that consider the full context relevant to the population they serve.

“While machine learning holds great promise in medical settings and other societal settings, these technologies could also exacerbate existing health inequities,” said Agata Foryciarz, a Stanford Ph.D. student in computer science and lead author of the study, published in BMJ Health & Care Informatics. “Our results suggest that evaluating disease risk prediction models for fairness can make their use more responsible.”

In addition to Foryciarz, the researchers include senior author Nigam Shah, chief data scientist at Stanford Health Care and faculty member at Stanford HAI; Google researcher Stephen Pfohl; and Birju Patel, a clinical specialist at Google Health.

Prudent prevention

The clinical guidelines evaluated in this study are for primary prevention of atherosclerotic cardiovascular disease. This condition is caused by the buildup of fat, cholesterol, and other substances, known as plaque, on the walls of arteries. Sticky plaque blocks blood flow and can lead to adverse outcomes such as stroke and kidney failure.

Guidelines issued by the American College of Cardiology and the American Heart Association give recommendations on when patients should start taking statins, drugs that lower the cholesterol levels that contribute to atherosclerosis.

The atherosclerotic cardiovascular disease guidelines consider blood pressure, cholesterol levels, diabetes diagnosis, smoking status, clinical factors such as hypertension treatment, and the demographics of sex, age, and race. Based on these data, the guidelines suggest using a calculator to estimate a patient’s overall risk of developing cardiovascular disease within 10 years. Patients identified as having intermediate or high risk are encouraged to initiate statin treatment. Conversely, for patients at borderline or low risk, statin therapy may be unnecessary or undesirable given the potential drug side effects.
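The risk categories described above can be sketched as a simple mapping from estimated 10-year risk to a guideline category. The 5%, 7.5%, and 20% boundaries are the widely published ACC/AHA cut points (the study’s analysis focuses on the 7.5% and 20% treatment thresholds); the labels and function name are illustrative, not the paper’s code.

```python
# Sketch: map an estimated 10-year ASCVD risk to a guideline risk category.
def risk_category(ten_year_risk: float) -> str:
    """Categorize a 10-year risk estimate using ACC/AHA-style cut points."""
    if ten_year_risk < 0.05:
        return "low"            # statins generally not indicated
    if ten_year_risk < 0.075:
        return "borderline"     # weigh risk-enhancing factors
    if ten_year_risk < 0.20:
        return "intermediate"   # statin therapy generally encouraged
    return "high"               # statin therapy recommended

print(risk_category(0.10))  # a 10% estimated risk falls in "intermediate"
```

Because treatment recommendations change discontinuously at these thresholds, even small miscalibrations near 7.5% or 20% can flip a patient’s recommendation.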

“If you as a patient are perceived to be at higher risk than you actually are, you may be given a statin you don’t need,” says Foryciarz. “But if you are perceived to be at lower risk and actually need a statin, your doctor may miss a preventive step that could have averted heart disease later on.”

Clinical practice guidelines increasingly encourage physicians to use clinical risk prediction models for different conditions and patient populations. The proliferation of calculator apps for phones and other electronic devices used in clinical settings means such tools are readily available to aid medical decision-making.

“Clinicians are likely to encounter and use these algorithm-based decision support tools more and more, so it is important that the people designing them work hard to ensure they are as fair and accurate as possible,” Foryciarz said.

Improved risk assessment

For their study, Foryciarz and colleagues used cohorts of over 25,000 patients aged 40 to 79, drawn from several large datasets. The researchers compared the actual incidence of atherosclerotic cardiovascular disease among patients with the predictions made by risk models. As part of these experiments, they built models using the two approaches, group recalibration and equalized odds, and compared their estimates with those generated by a baseline model with no fairness adjustment.

To recalibrate for each of the four subgroups individually, the researchers ran the base model on each subgroup and adjusted its risk scores to match the actual percentage of patients in that subgroup who developed disease. This approach succeeded in increasing the model’s agreement with the guidelines for low-risk patients. On the other hand, differences in error rates across subgroups remained evident, especially at the high-risk end.
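One common way to implement the per-subgroup adjustment just described is a one-dimensional logistic (Platt-style) recalibration fitted separately within each group, so that each group’s mean predicted risk tracks its observed event rate. The sketch below uses synthetic data and is an assumption about the general technique, not the paper’s exact procedure.

```python
# Sketch of group recalibration: fit a logistic recalibrator on the base
# model's logit scores separately within each subgroup. Synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

def recalibrate_by_group(scores, outcomes, groups):
    """Return one fitted recalibrator (logit score -> probability) per group."""
    recalibrators = {}
    logit = np.log(scores / (1 - scores)).reshape(-1, 1)
    for g in np.unique(groups):
        m = groups == g
        recalibrators[g] = LogisticRegression().fit(logit[m], outcomes[m])
    return recalibrators

rng = np.random.default_rng(1)
groups = rng.integers(0, 4, size=4000)                    # 4 demographic subgroups
scores = rng.uniform(0.02, 0.5, size=4000)                # base model risk estimates
true_risk = np.clip(scores * (0.6 + 0.2 * groups), 0, 1)  # miscalibration varies by group
y = (rng.uniform(size=4000) < true_risk).astype(int)      # simulated outcomes

cals = recalibrate_by_group(scores, y, groups)
for g, cal in cals.items():
    m = groups == g
    p = cal.predict_proba(np.log(scores[m] / (1 - scores[m])).reshape(-1, 1))[:, 1]
    print(f"group {g}: observed rate {y[m].mean():.3f}, "
          f"mean recalibrated risk {p.mean():.3f}")
```

After recalibration, each group’s average predicted risk matches its observed rate, which is exactly the calibration property the guidelines’ fixed treatment thresholds rely on.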

In contrast, the equalized-odds approach required building a new predictive model constrained to yield equal error rates across subgroups. In practice, this means matching the false-positive and false-negative rates across the population. False positives are patients identified as high risk, who would start statin therapy but never develop atherosclerotic cardiovascular disease; false negatives are patients identified as low risk who go on to develop the disease, and who might have benefited from taking statins.
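The quantities an equalized-odds constraint tries to match can be computed directly: at a given treatment threshold, tally each subgroup’s false-positive and false-negative rates. This sketch uses synthetic data; the threshold shown is the guidelines’ 7.5% cut point, and the variable names are illustrative.

```python
# Sketch: per-subgroup false-positive and false-negative rates at a
# treatment threshold, the error rates equalized odds aims to equalize.
import numpy as np

def error_rates(y_true, y_score, threshold):
    pred = y_score >= threshold          # "high risk": would start a statin
    fp = np.sum(pred & (y_true == 0))    # flagged, but never develops disease
    fn = np.sum(~pred & (y_true == 1))   # missed, but does develop disease
    fpr = fp / max(np.sum(y_true == 0), 1)
    fnr = fn / max(np.sum(y_true == 1), 1)
    return fpr, fnr

rng = np.random.default_rng(2)
groups = rng.integers(0, 4, size=4000)
risk = rng.uniform(0.0, 0.4, size=4000)
y = (rng.uniform(size=4000) < risk).astype(int)

for g in range(4):
    m = groups == g
    fpr, fnr = error_rates(y[m], risk[m], threshold=0.075)
    print(f"group {g}: FPR={fpr:.2f}, FNR={fnr:.2f}")
```

Note the trade-off this makes concrete: forcing these per-group rates to be equal generally requires shifting each group’s effective decision threshold, which is the distortion described next.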

Using the equalized-odds approach ultimately skewed the decision thresholds for the various subgroups. Compared with the group-recalibration approach, a calculator built with equalized odds in mind did not prevent the adverse outcomes that follow from under- and over-prescribing statins.

Improving accuracy through group recalibration requires additional time and effort to adjust the original model rather than leaving it alone, but that is a small price to pay for improved clinical outcomes. An additional caveat is that dividing the population into subgroups increases the likelihood that sample sizes will be too small to assess risk within a subgroup effectively, while also diminishing the model’s ability to generalize its predictions to other subgroups.

Overall, the researchers suggest, algorithm designers and clinicians alike should keep in mind which fairness metrics to use for evaluation and which, if any, to use for model tuning. They should also consider how a model or calculator will be used in the real world, and how erroneous predictions can lead to clinical decisions with negative health consequences down the road. The researchers point out that further development of algorithmic fairness approaches could improve these outcomes.

“It’s not always easy to pinpoint which of the many subgroups to focus on, but considering some subgroups is better than not considering them,” Foryciarz says. “Developing algorithms that serve a diverse population means that the algorithms themselves need to be developed with that diversity in mind.”


For more information:
Agata Foryciarz et al, Evaluating algorithmic fairness in the presence of clinical guidelines: the case of atherosclerotic cardiovascular disease risk estimation, BMJ Health & Care Informatics (2022). DOI: 10.1136/bmjhci-2021-100460

Citation: Ensuring the fairness of algorithms that predict patient disease risk (1 August 2022), retrieved 1 August 2022 from disease.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
