Study: Sycophantic AI can undermine human judgment

Source: https://arstechnica.com/feed/
Summary

Another concerning finding is that study participants consistently described the AI models as objective, neutral, fair, and honest—a common misconception. “This means that uncritical advice under the guise of neutrality can be even more harmful than if people had not sought advice at all,” said Khadpe.

The study did not examine possible interventions, the authors note, keeping its focus on the default behavior of these AI models. Changing system prompts might help, such as asking the AI to take the other person’s perspective, or optimizing the models at later training stages to prioritize more critical behaviors. But the field is so new that most proposed interventions still need further study. According to Cheng, preliminary results from follow-up work indicate that making the training data sets less affirming, or simply instructing the model to begin every response with “Wait a minute,” can reduce levels of sycophancy.

The authors emphasized that the onus should not be on users to address these issues; it should be on developers and policymakers. “We need to move our objective optimization metrics beyond just momentary user satisfaction towards more long-term outcomes, especially social outcomes like personal and social well-being,” said Khadpe. “At the same time, our frameworks for how we evaluate these AI systems also need to consider the broader social context in which these interactions are embedded.”

“AI is already here, close to our lives, but it’s also still new,” said Cheng. “Many would argue that it’s still actively being shaped. So you could imagine an AI that, in addition to validating how you’re feeling, also asks what the other person might be feeling, or that even says, ‘Maybe close the app and go have this conversation in person.’ The quality of our social relationships is one of the strongest predictors of health and well-being we have.
Ultimately, we want AI that expands people’s judgment and perspectives rather than narr...

First seen: 2026-03-26 18:14

Last seen: 2026-03-27 21:32