AI

AI sycophancy and the specialist models that are solving it

By Fayola-Maria Jack, Founder, Resolutiion

There’s a growing realisation that AI tells you what it believes you want to hear. Research has shown that convincingly written sycophantic responses can outperform correct ones, in preference and acceptance rates, a non-negligible fraction of the time.  

The implications extend far beyond disappointing chatbot interactions. In mental health applications, for example, AI systems that validate every thought pattern risk reinforcing harmful beliefs rather than encouraging healthier perspectives. In educational settings, students receive affirmation for incorrect reasoning which can leave genuine misunderstandings unaddressed. In professional decision-making, executives might use AI-generated analyses that mirror their existing biases rather than challenge flawed assumptions. 

The common thread is that sycophancy optimises for user satisfaction in the moment, not accuracy or long-term benefit. When an AI system’s primary training objective is to avoid disagreement, it learns to tell people what they want to hear, without due consideration to context and consequence. 

Conflict and dispute resolution is another example of a high-stakes setting. The problem is straightforward: if your AI system tells each party in a dispute that they’re right, you haven’t resolved anything. You’ve just created two validated but incompatible narratives, each party now more convinced of their position than before. Far from being mediation, it’s fuel for escalation. 

Why sycophancy is particularly dangerous in disputes 

General-purpose chatbots are optimised for “helpfulness” through reinforcement learning from human feedback. This creates models that learn to match user beliefs rather than challenge them. In customer service or casual conversation, this might build user satisfaction. But, in dispute resolution it actively raises the stakes of disagreement. 

The risks manifest in several distinct ways. In any dispute, each party tends to hold a self-consistent but conflicting narrative. When an AI affirms both sides without challenge, it effectively validates opposing truths. This creates an illusion of fairness where both parties feel heard, but in reality it cements division, as neither party is nudged toward recognising the other’s perspective or any underlying shared interests. 

Conflict resolution depends entirely on the perception of neutral ground. If the AI appears to say different things to each party, or later contradicts itself, the ground has been laid for trust to collapse. Indeed, the entire process risks being delegitimised. Unlike customer service scenarios where a flattering answer might build goodwill, in disputes flattery becomes a fundamental flaw because objectivity is a core currency of effective mediation. 

The false equivalence trap 

There’s another problem that’s less obvious, but just as damaging. Generalist AI models often treat two perspectives as equally valid, even when the facts don’t support them both. 

This pattern shows up across domains where AI is being deployed for sensitive decisions. In content moderation, systems can treat a factual correction and targeted harassment as equally valid “perspectives” to avoid appearing as biased. In healthcare triage, AI can give equal weight to an evidence-based concern and an unfounded medical anxiety to avoid appearing dismissive. The result in both cases is that misinformation or disproportionate responses can be legitimised, simply because generalist models are prone to validate. 

The problem intensifies when power dynamics are involved. An AI system that treats all voices as equally valid regardless of context can end up amplifying whoever asserts their position most confidently, even when the facts don’t support them. 

Take a dispute where one side makes demonstrably false claims, or where there’s a clear power imbalance. The AI responds with: “You both raise important points”. This is not neutrality, because it gives credibility to misinformation. 

There’s a difference between being neutral and pretending every argument holds water. Good mediators know when to call out something that’s plainly incorrect, and when to rein in someone who’s dominating to the detriment of the overall process. But a generalist AI model trained to keep everyone happy? It will agree with the person making the boldest claims, even if those claims don’t stand up.  

The need for specialist models 

Rather than reducing AI adoption in sensitive domains, the recognition of sycophancy will likely accelerate the shift toward specialist models. General-purpose large language models are optimised for casual question-and-answer interactions, not impartial input. 

Specialist systems work differently because they’re built with entirely different objectives. They define success not as user satisfaction but as resolution or progress, which changes everything about how they behave. 

The technical differences matter here. Specialist models are trained on what’s called privileged data – information not available in generic internet training. In the conflict and dispute resolution sphere, this includes a plethora of proprietary data, shaped on domain best practice. These data points teach the system what effective conflict navigation looks like in practice, rather than what makes people feel validated. 

Beyond the data itself, specialist systems are built around structured dialogue frameworks as opposed to free-form conversation. This structured approach helps the system distinguish between validating feelings and validating positions. It can acknowledge that someone feels frustrated without confirming they’re right to be frustrated, or surface what stakeholders actually need instead of reflecting back what they’ve said. 

Specialist conflict resolution systems incorporate evaluation methods that assess tonal neutrality as well as factual correctness. Unlike mainstream models that lean heavily on reinforcement learning from human feedback, specialist fine-tuning is guided by deep technical and domain expertise. 

Transparency becomes a core feature rather than an afterthought. If the system corrects a factual inaccuracy, it explains why. This prevents perceptions of hidden bias and frames any correction as maintaining fairness rather than taking sides. 

Looking ahead 

So, far from rejection of AI in conflict and dispute resolution, what we’ll likely see is stronger segmentation emerge across the board. Consumer large language models will continue to handle casual use cases where agreement and helpfulness are appropriate objectives. Specialist systems will be developed for contexts where accuracy, neutrality or professional standards must take precedence over user satisfaction. 

This shift has implications well beyond dispute resolution. Healthcare diagnostics, legal analysis, financial advice, therapeutic support – any domain where professional judgement matters – will likely see similar movements towards purpose-built models trained on privileged data and evaluated against domain-specific standards. The recognition that one-size-fits-all AI carries inherent risks in high-stakes contexts will drive more sophisticated, context-appropriate deployment strategies. 

In conflict resolution, specialist models will be adopted precisely because they’re engineered to avoid the pitfalls of sycophancy and align with the professional norms of mediation and negotiation practice.  

But the broader lesson applies across many sensitive domains – the organisations and practitioners who understand when general-purpose AI isn’t fit for purpose will be the ones who integrate these technologies effectively. Not because the technology tells everyone what they want to hear, but because it’s designed to do something far more valuable: provide genuinely useful support that serves needs, rather than preferences. 

Author

Related Articles

Back to top button