Working With AI: Carnegie Mellon Researchers Explore Relationships With Artificial Intelligence Tools

Marylee Williams | Tuesday, April 29, 2025

CMU teams are investigating how researchers and developers can improve people's understanding of and confidence in AI systems. Many of those researchers are presenting their work at CHI 2025 this week in Japan.

Chatbot companions, generated videos, banking software and homework help. Artificial intelligence software has found its way into more and more aspects of daily life. But for these systems to work at their full potential, users need to understand and trust them. The systems also need baked-in methods for addressing biases or errors.

Researchers in Carnegie Mellon University's School of Computer Science (SCS) are investigating how researchers and developers can improve people's understanding of and confidence in AI systems. They are presenting the work highlighted below at the Association for Computing Machinery's Conference on Human Factors in Computing Systems (CHI 2025) this week in Yokohama, Japan.

Exploring AI Literacy

AI literacy, the knowledge and skills needed to use and evaluate AI models, helps people make better decisions about AI tools and safely participate in a digital world. However, much of the AI literacy research has focused on users and developers. A team from the Human-Computer Interaction Institute (HCII) took a broader approach in their paper, "Exploring What People Need To Know To Be AI Literate: Tailoring for a Diversity of AI Roles and Responsibilities." Authors included Assistant Professor Motahhare Eslami, Tang Family Professor John Zimmerman and Ph.D. student Shixian Xie.

In the paper, which received an honorable mention at CHI 2025, researchers used service design to analyze how AI systems can create value with different people and groups. The team used this novel process, generally reserved for business, to identify the roles, responsibilities and gaps in AI literacy.

"It's interesting when you bring a different method — like service design — from a different field and apply it to a familiar field, like computer science," Eslami said. "Because of how this field is growing and changing in surprising ways, it's more and more critical to use new methods to study it, particularly for AI and AI literacy."

Researchers found 16 distinct roles in AI systems and noted missing competencies, like financial risk assessment for organizational leaders such as CEOs.

"AI literacy has largely focused on K-12 education. It's probably the majority of the research," Xie said. "But we cannot just wait for the kids to grow up to be those professionals. The professionals need help right away. They need those competencies and training now. It's a big point in our research, calling on AI literacy to move beyond classrooms."

Impact of AI Mismatches

Organizational leaders in fields ranging from agriculture to pharmaceuticals are looking to AI to improve their operations. However, as researchers from SCS and the University of Wisconsin-Madison found in their study, "AI Mismatches: Identifying Potential Algorithmic Harms Before AI Development," the failure rate of these systems is often overlooked.

AI mismatches occur when an AI model doesn't function in a way that both creates value and minimizes harm. For example, an AI system used to help child protection workers when placing children in foster care needs to demonstrate high performance with minimal errors. But if the data used to train the system is poor, there is a higher likelihood of error — which can cause significant harm.

The study evaluated AI performance based on how well it fulfilled human needs rather than on predictive accuracy. Researchers analyzed 774 AI cases and identified seven critical factors, such as data quality, that contribute to AI mismatches. The team stressed the importance of early-stage risk assessment and increased collaboration to minimize risks.

The HCII research team included Zimmerman, Ph.D. student Ji-Youn Jung, Herbert A. Simon Professor Jodi Forlizzi, and Assistant Professor Kenneth Holstein.

AI and Moral Decision-Making

Limited transparency about decision-making in AI tools reduces trust in those tools, which presents challenges when they're used to help make decisions with moral weight. A team of SCS researchers worked with colleagues at Duke and Yale universities to better understand if and how AI can model human moral decision-making. SCS members of the team included Hoda Heidari, the K&L Gates Career Development Assistant Professor in Ethics and Computational Technologies in the Machine Learning Department (MLD) and the Software and Societal Systems Department (S3D); and Vincent Conitzer, a professor in the Computer Science Department (CSD).

Researchers interviewed 20 study participants, asking them to make and explain their decisions about kidney allocations. Using fictional scenarios, participants chose which patient should get a kidney and why. This setup allowed researchers to observe the nuances and dynamics of human moral decision-making, which changes with context and reflection. They then compared this process with how AI models make moral decisions, which can rely on consistent preferences and lack context from previous decisions.

"The AI literature has captured human moral reasoning in a simplistic manner, and our paper provides evidence that human moral decision-making is much more nuanced than we want to believe," Heidari said. "These decisions would be much easier if we all had a fixed rule in mind to determine who should get a kidney. The reality is that's not the case, and that's an important element of morality for us. Context can shift as we make more decisions, and our moral reasoning and decision-making changes, which is an important facet of morality. If our goal is to come up with AI systems compliant with our notion of morality, we have to account for those qualities of human decision-making."

Researchers noted that in some instances the decision process was also a learning process. Participants engaged in discussions to reason about their preferences, which led to some people changing their preferences. They also noted that this study is part of a growing body of research on human-computer interaction and the feasibility of human-centered AI designs.

AI Companions and Bias

AI companions, like chatbots, can serve as friends or confidants, but that doesn't mean they always have your back. HCII researchers examined how users of these systems correct AI outputs that they perceive as harmful.

Whether from biases in the training dataset or users unintentionally jailbreaking these systems, AI companions may say something hurtful to a user. Hong Shen, an assistant research professor in the HCII, and Ph.D. student Qing Xiao studied how users of these tools can guide AI companions to stop or amend hurtful behaviors. The two collaborated with Maarten Sap, an assistant professor in the Language Technologies Institute (LTI), and LTI Ph.D. student Xuhui Zhou, as well as researchers from Tsinghua University, Stanford University and George Mason University.

"AI companions are a unique case, because users develop a parasocial relationship with them," Shen said. "It's no longer the use case of a chatbot doing some research review. It's more like the chatbot giving emotional support."

Misogyny, ableism and racism were some of the types of discrimination these AI companions exhibited. Researchers analyzed how users' perceptions of these AI systems corresponded with how they corrected discriminatory statements. For example, someone might see an AI system as a baby — something that is learning and could potentially repeat the hurtful or discriminatory things another user might say. In this situation, the user correcting the AI's behavior might employ a tactic of reasoning with or preaching to the system. On the other hand, if a user sees the AI system as a machine, they might give a low feedback score when asked to evaluate the hurtful statement.

Xiao said this research examines a bottom-up approach to correcting AI's harmful behavior.

"We are thinking beyond the developers themselves," Xiao said. "In the future, I think this could be a useful approach to empower users to do the alignment job in situations when they detect harmful statements."

More information about the research mentioned above can be found on the CHI 2025 website.

For More Information

Aaron Aupperlee | 412-268-9068 | aaupperlee@cmu.edu