In a newly published study, Assistant Professor Cheng “Chris” Chen of the School of Communication tests whether people understand that unrepresentative data used to train a facial expression classification AI system can result in biased performance.
Chris Chen
By Colin Bowyer, Communications Manager - October 22, 2025
Artificial intelligence (AI) systems designed to identify emotions from facial expressions often produce biased results, such as associating Black faces with negative emotions. This is largely due to skewed training datasets in which race acts as a confounding factor, for example, a dataset containing a disproportionate number of white faces displaying happy expressions and Black faces displaying unhappy ones. Trained on such data, the system factors race into its judgment of emotion alongside genuine facial features. Just as a frown counts toward classifying a face as showing a negative emotion, dark skin becomes coded as a likely indicator of unhappiness.
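To see how such a confound can surface in a trained model, consider the following minimal sketch. It is not the study’s Emotion Reader AI; the features, counts, and choice of a logistic-regression classifier are illustrative assumptions. A model fit on data where happy faces are mostly light-skinned and unhappy faces mostly dark-skinned learns a weight on skin tone, so two smiling faces that differ only in skin tone receive different scores.

```python
# Minimal sketch (not the study's Emotion Reader AI): a toy classifier trained on
# confounded data, where "happy" faces are mostly light-skinned and "unhappy"
# faces are mostly dark-skinned. Feature names and counts are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two binary features per face: [smiling, dark_skin]
X = np.array(
    [[1, 0]] * 90 +   # happy, light-skinned   (over-represented)
    [[1, 1]] * 10 +   # happy, dark-skinned    (under-represented)
    [[0, 1]] * 90 +   # unhappy, dark-skinned  (over-represented)
    [[0, 0]] * 10     # unhappy, light-skinned (under-represented)
)
y = np.array([1] * 100 + [0] * 100)  # 1 = happy, 0 = unhappy

model = LogisticRegression().fit(X, y)

# The model learns a negative weight on dark skin even though skin tone carries
# no real information about emotion: the confound gets encoded as a cue.
print(dict(zip(["smiling", "dark_skin"], model.coef_[0].round(2))))

# Two smiling faces that differ only in skin tone receive different scores.
light, dark = [[1, 0]], [[1, 1]]
print("P(happy | smiling, light):", model.predict_proba(light)[0, 1].round(2))
print("P(happy | smiling, dark): ", model.predict_proba(dark)[0, 1].round(2))
```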
In a new study published in Media Psychology, Cheng “Chris” Chen, assistant professor of emerging media and technology at the School of Communication, and her co-authors define race as a confound: an irrelevant variable that influences AI outcomes because of unrepresentative training data. The researchers then ask whether laypeople can detect racial bias in AI systems caused by such data. The study set out to explore how users interpret training data, to identify ways of improving awareness of algorithmic bias, and to propose visual cues that help communicate bias.
The study was partly motivated by an instance of AI bias that Chen experienced in her professional life. “I was giving a virtual presentation where there were auto-generated transcripts by AI,” Chen explained. “When I saw the transcripts myself afterwards, they were totally different from what I was saying, for example, mishearing and transcribing ‘algorithmic bias’ as ‘algorithmic buys,’ but when I saw the transcripts of my native English-speaking peers, they were accurate.” The AI’s transcription missteps during the virtual presentation prompted Chen to design a study to make people aware of bias in these now-ubiquitous technologies.
With nearly 800 participants, the researchers conducted three experimental studies using a prototype AI system called Emotion Reader AI, which classifies facial expressions as happy or unhappy. The first study focused on the biased representation of races, in which race acts as a systematic error, or confound, in the training data. The second focused on the lack of adequate representation of a particular race in the training data. The third combined both types of racial misrepresentation along with their counterexamples.
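As a rough illustration of the difference between the first two conditions (the counts below are hypothetical and not drawn from the paper), a confounded sample entangles emotion with race, while an under-represented sample simply contains very few faces of one race:

```python
# Illustrative sketch of two kinds of unrepresentative training data
# (counts are hypothetical, not taken from the study's stimuli).
from collections import Counter

# Racial confound: emotion and race are entangled in the sample.
confounded = (
    [("happy", "White")] * 45 + [("unhappy", "Black")] * 45 +
    [("happy", "Black")] * 5 + [("unhappy", "White")] * 5
)

# Under-representation: one race barely appears at all, in either emotion class.
underrepresented = (
    [("happy", "White")] * 45 + [("unhappy", "White")] * 45 +
    [("happy", "Black")] * 5 + [("unhappy", "Black")] * 5
)

print("Confounded:       ", Counter(confounded))
print("Under-represented:", Counter(underrepresented))
```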
The first finding was that most users did not perceive bias from a snapshot of unrepresentative training data. One possible explanation is that users relied on simpler cues, such as accuracy, to evaluate racial bias in machine learning algorithms. Because every facial image in the training sample was classified correctly, users may not have viewed unrepresentative training data with racial confounds as problematic.
Second, users were more likely to perceive bias when the AI system performed poorly, especially when it misclassified emotions based on race. Performance bias had a stronger impact on perceived fairness than training data visuals.
Finally, the user’s race mattered in identifying bias in the training data. Black participants were more sensitive to biased training data, especially when it portrayed Black individuals negatively. White participants were less likely to notice or be concerned about racial bias unless it affected them directly.
“The failure to use racial confounds in the training sample to infer algorithmic bias is surprising given the stark contrast between smiling White subjects and sad Black subjects featured in one of the study’s stimuli,” explained Chen. “This study highlights the need for technical solutions to ensure fairness, rather than relying on user perception alone.”