An algorithm that crunches data from brain scans to predict people’s performance on behavioral tasks is less accurate for Black Americans than for their white counterparts, according to a new study. The findings may reflect a lack of racial and ethnic diversity in imaging datasets used to train such algorithms, the researchers say.
“A lot of the algorithms we have just don’t generalize” to groups that have been historically underrepresented in science, says Mallar Chakravarty, associate professor of psychiatry at McGill University in Montreal, Canada, who was not involved in the study.
This bias may also be introduced at multiple points in the data analysis pipeline, says lead investigator Sarah Genon, professor of cognitive neuroinformatics at the University of Düsseldorf in Germany. “All the methods that we are using for pre-processing our data have been optimized for a population dominated by white Europeans or Americans,” she says.
To make the research “more useful for real people and their families, we need to be doing better at representing who those real people are” in the datasets, says Brenden Tervo-Clemmens, a postdoctoral researcher at Harvard University, who was not involved in the study. “This is how neuroscience meets the real world.”
Neuroimaging researchers have made strides over the past decade toward improving the field’s reliability — in part by developing large, multi-site imaging datasets, such as the Human Connectome Project (HCP) and the Adolescent Brain Cognitive Development (ABCD) study.
Each of these datasets contains functional magnetic resonance imaging (fMRI) brain scans from people of different races and ethnicities. But the majority of participants identify as white, Genon and her colleagues point out: 76.1 percent of the 948 HCP participants and 56 percent of the 5,351 ABCD participants. By comparison, 13.6 percent of HCP and 11.9 percent of ABCD study participants identify as Black or African American.
Researchers who rely on these cohorts, therefore, “are building models for a specific population,” Genon says — which became evident when she and her colleagues used an algorithm to analyze resting-state fMRI data from the HCP and ABCD studies to predict the participants’ performance on a panel of behavioral measures.
Starting with the HCP database, the team divided the 101 Black participants into 10 groups and matched each one with a white participant of the same age and gender who scored similarly on the dataset’s behavioral measures, which included evaluations of processing speed, walking endurance and working memory. The researchers split the remaining participants randomly among the 10 groups. They trained a predictive algorithm on data from nine of the groups to identify associations between participants’ resting-state fMRI data and their behavioral results, and then used it to predict the performance of participants in the remaining group.
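The train-on-nine-groups, predict-the-tenth scheme described above is a standard 10-fold cross-validation. The following is a minimal sketch of that procedure using simulated data and a simple ridge-regression predictor; the feature counts, participant counts and choice of predictor here are illustrative assumptions, not details taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 500 participants, 100 resting-state
# fMRI connectivity features, one behavioral score per person.
n, p = 500, 100
X = rng.standard_normal((n, p))
w_true = rng.standard_normal(p)
y = X @ w_true + rng.standard_normal(n)

# Assign every participant to one of 10 folds, mirroring the study's
# scheme of training on 9 groups and predicting the held-out one.
folds = rng.permutation(np.repeat(np.arange(10), n // 10))

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression (a stand-in for the study's predictor)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

scores = []
for k in range(10):
    train, test = folds != k, folds == k
    w = ridge_fit(X[train], y[train])
    pred = X[test] @ w
    # Prediction accuracy is commonly summarized as the correlation
    # between predicted and observed behavioral scores.
    scores.append(np.corrcoef(pred, y[test])[0, 1])

print(f"mean cross-validated accuracy r = {np.mean(scores):.2f}")
```

In the actual study, the fold assignments were not fully random: each Black participant and their matched white counterpart were kept together, so that every held-out fold contained matched pairs whose prediction accuracy could be compared directly.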
For the ABCD database, which includes data from multiple study sites, Genon and her colleagues matched pairs of white and Black participants based on site and brain volume, in addition to results from that study’s behavioral panel, which included measures of cognitive control and long-delay recall, for example. They split the data into 10 groups, trained the algorithm on 7 and tested it on the remaining 3.
Across all racial and ethnic groups, the trained models reliably predicted performance on only 6 of the 51 behavioral measures from the HCP cohort and 9 of the 36 measures from the ABCD cohort. And for the majority of those measures — including performance on tests of spatial orientation, cognitive flexibility and working memory — predictions were less accurate for Black participants than for their age- and gender-matched white counterparts.
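The group-wise comparison behind that finding can be sketched in a few lines: compute prediction accuracy separately within each matched subgroup and compare. The data below are simulated, and the built-in accuracy gap is an artificial illustration of the kind of disparity the study reports, not a result from it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated observed behavioral scores for 100 matched pairs.
n_pairs = 100
observed = rng.standard_normal(2 * n_pairs)
group = np.array(["white", "Black"]).repeat(n_pairs)

# Inject larger prediction error for one group to mimic the reported
# accuracy gap (purely illustrative noise levels).
noise = np.where(group == "white", 0.5, 1.5)
predicted = observed + noise * rng.standard_normal(2 * n_pairs)

def accuracy(mask):
    """Correlation between predicted and observed scores within a group."""
    return np.corrcoef(predicted[mask], observed[mask])[0, 1]

r_white = accuracy(group == "white")
r_black = accuracy(group == "Black")
print(f"r (white) = {r_white:.2f}, r (Black) = {r_black:.2f}")
```

Because the pairs are matched on age, gender and behavioral scores, a systematic within-pair accuracy gap like this one points at the model and its training data rather than at demographic differences between the groups.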
An algorithm trained on subsets of data from about 400 Black participants from the ABCD cohort learned different associations between brain activity and behavioral performance than when trained on data from white participants, the team found.
And although the model trained on data from Black participants yielded more accurate predictions about other Black participants’ behavioral performance, many of those predictions were still less accurate than they were for white participants.
The exact source of the racial bias is not clear, the researchers say.
One possible cause is that most brain-imaging resources — from the atlases used to delineate brain regions to the imaging templates used to standardize scans obtained at different sites — have been developed mainly using data from white participants, who are not representative of the general population, Genon says.
Additionally, the behavioral tests used in these datasets have been validated for the most part in white populations, says study investigator Jingwei Li, a postdoctoral researcher at the Research Center Jülich in Germany. “There are arguments of whether those behavior measures are valid for different ethnic groups,” she says.
To address these biases, neuroimaging researchers need to “represent the populations right,” Chakravarty says, much like genetics researchers have worked to do in recent years. That means making the effort to build large, diverse cohorts, even when it’s more challenging, he says.
Teams should also “always make the disaggregated data available so that researchers who are working on these data can make choices about how they’re going to include variables like race, ethnicity, gender, even socioeconomic status,” Chakravarty says. “Having to figure out how you describe those things, and talk about those things, requires a bit of background work and sensitization on the part of researchers.”
“We’re only now really beginning to understand the impact that societal pressures, social pressures [and] adversity really play on different historically underrepresented or marginalized communities, and how that may affect brain development and growth,” he says.
Failing to address these biases in the datasets used to build predictive models could have serious consequences, the researchers say. The reliability of precision medicine, for example, which sometimes uses algorithms to determine the best course of action for a patient, should not change based on a person’s racial or ethnic background.
“If we don’t have good enough representation, it’s hard to expect these algorithms to do better than what we’re giving them,” Tervo-Clemmens says.
Cite this article: https://doi.org/10.53053/VPYT8075