The Sycophancy Fallacy: Why You May Be Worried About the Wrong Bias in Search
AI search tools aren't "agreeing" with users—they are retrieval systems. Confusing the two is a category error that obscures the real risks.
I’ve had a post titled “Common Misconceptions Librarians Have About Information Retrieval” sitting in my drafts for months. I never published it because some points felt like trivial nitpicking (lexical search doesn’t have to be Boolean; overly narrow definitions of “neural search”), and others were eventually covered elsewhere.
But recently I’ve encountered two related ideas that concern me more, because they’re seductive and contain just enough truth to seem plausible—while obscuring where the genuine problems actually lie.
I’m not naming sources—this isn’t about embarrassing anyone—but trust me, these aren’t straw-man arguments I invented just to knock down.
Misconception 1: “AI search is dangerous because it only gives you what you want”
This argument typically starts with an accurate premise: LLMs like ChatGPT exhibit sycophantic tendencies as a result of techniques like RLHF (Reinforcement Learning from Human Feedback)1. From there, it often goes off the rails. Librarians worry that users will be trapped in filter bubbles, with AI tools reinforcing outlandish beliefs and leading them down conspiracy theory rabbit holes.
There are real concerns here, but we need to be clear about what exactly we should worry about.
The category error
Sycophancy is a conversational behaviour—agreeing with stated positions, excessive hedging, reluctance to correct the user. It’s meaningfully different from retrieval bias, which is what would actually affect search systems.
When you search for documents, the retrieval system isn’t “agreeing” with you. It’s matching your query to content. These are fundamentally different operations. More on that later.
The architectural misunderstanding
Many also seem not to realise that conventional AI search is typically search plus generation (now commonly known as Retrieval Augmented Generation, or RAG), not just an LLM responding from its pretraining.
When someone uses Consensus or Elicit, they’re querying a corpus. The LLM layer summarises retrieved documents; it isn’t confabulating from its weights. The retrieval step is functionally similar to what a discovery layer does; the generation step is functionally closer to what a reference librarian does when synthesising sources for a patron.
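To make the two-stage architecture concrete, here is a toy sketch in Python. Everything in it is deliberately simplified and hypothetical (real tools use vector indexes and an actual LLM for the generation step), but it shows why the retrieval stage is matching, not agreeing:

```python
# Toy sketch of the RAG pattern: retrieve, then generate.
# All names and documents here are hypothetical; a real system would use a
# vector index for retrieval and prompt an LLM for the generation step.

CORPUS = {
    "doc1": "large cohort studies find no association between vaccines and autism",
    "doc2": "semantic search retrieves documents by meaning rather than exact words",
    "doc3": "rlhf can make chat models overly agreeable in conversation",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Stage 1: match the query against the corpus (naive word overlap here).
    Note there is no 'agreeing' with the user, only matching."""
    q_terms = set(query.lower().split())
    scored = sorted(
        CORPUS.items(),
        key=lambda item: len(q_terms & set(item[1].split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

def generate(query: str, doc_ids: list[str]) -> str:
    """Stage 2: summarise the retrieved documents. A real system prompts an
    LLM grounded on this retrieved text, not on its pretraining alone."""
    context = "; ".join(CORPUS[d] for d in doc_ids)
    return f"Summary for '{query}' from {len(doc_ids)} retrieved sources: {context}"

print(generate("do vaccines cause autism", retrieve("do vaccines cause autism")))
```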
Criticising RAG-based search for “sycophancy” is like criticising a systematic review for only including studies that matched the search strategy2. It’s a category error.
I should note this is most clearly true for simpler RAG architectures. Modern agentic search systems increasingly blur the boundary between retrieval and generation. Tools like Consensus and Undermind now use LLMs to reformulate user queries before retrieval—expanding terms, inferring intent, generating multiple search strategies. Could the LLM interpret your intent sycophantically at this stage (see also the next section)? I personally think the risk is low, because you are using a specialised tool rather than a general-purpose LLM like ChatGPT, and such tools are not designed to “know” you.
Still, this is not a foregone conclusion, and it is worth studying.
It also has a simple solution: transparency about query reformulation, not abandoning AI-assisted search altogether.
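What might that transparency look like? A minimal sketch, with a stub standing in for the LLM call (the function names and the example rewrites are mine, purely illustrative): every rewrite is carried alongside the results and shown to the user, so nothing happens silently.

```python
# Sketch of transparent query reformulation. reformulate() is a stub standing
# in for an LLM call; the point is that every rewrite is surfaced to the user
# rather than applied silently before retrieval.

from dataclasses import dataclass, field

@dataclass
class SearchTrace:
    original_query: str
    reformulated_queries: list[str] = field(default_factory=list)

def reformulate(query: str) -> list[str]:
    """Stand-in for LLM-based expansion (synonyms, inferred intent, multiple
    strategies). A hypothetical example, not any vendor's actual behaviour."""
    return [query, f"{query} systematic review", f"{query} randomised trial"]

def transparent_search(query: str) -> SearchTrace:
    trace = SearchTrace(original_query=query)
    trace.reformulated_queries = reformulate(query)
    # ...retrieval would run over each reformulated query here...
    return trace

trace = transparent_search("screen time and adolescent wellbeing")
print("You searched for:", trace.original_query)
print("We also searched for:", trace.reformulated_queries)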
Even pretraining responses mostly reflect corpus consensus
Here’s what’s underappreciated: even pure LLM responses (without retrieval) default to corpus consensus. This is often the opposite of telling users what they want to hear.
Ask a model about vaccines, climate change, or evolution—you’ll get the scientific mainstream regardless of your priors. The model tends to stick to what it knows was most prevalent in its training data.
Sycophancy only kicks in when you state a position and the model defers to it conversationally. That’s a different failure mode entirely, and not one that typically applies to a search query like “what does the evidence say about X.”
LLMs are better described as consensus machines3.
Trained on vast amounts of data, they’re strongly disposed toward mainstream positions. Far from agreeing with fringe views, they often push back aggressively against claims they “know” to be false—sometimes so rigidly that they’ll deny breaking news events because such information conflicts with their training data4.
This has an underexplored downside. If LLMs default to mainstream positions, they may disadvantage emerging research, minority scholarly viewpoints, or fields where “consensus” is contested or disciplinarily biased. A system that systematically privileges well-established views could subtly suppress heterodox scholarship—not through malice, but through training data distributions.
This isn’t the same as the filter-bubble concern (which imagines AI reinforcing individual user biases), but it’s a form of bias worth investigating empirically.
On genuinely contested issues, yes, LLMs may be inclined to agree with the user. But this is addressable through better prompts and system design.
The irony: traditional search has always “given people what they want”
The concern about AI search “giving people what they want” is particularly ironic given that much of traditional search has done this for decades:
Google’s personalisation literally optimises for engagement
Citation-based academic ranking amplifies already-popular work
Users self-select sources confirming their priors through the databases they choose and the keywords they use
None of this is new with AI. If anything, semantic search that surfaces conceptually related but differently framed work might reduce confirmation bias compared to keyword matching, which only returns documents using your exact terminology.
That said, semantic search might also surface conceptually related documents that reinforce a user’s framing rather than challenge it. Whether semantic retrieval increases or decreases exposure to diverse perspectives is an empirical question, not a foregone conclusion in either direction.
Where legitimate concerns actually lie
This isn’t to say there are no concerns worth examining. But they’re empirical questions, not foregone conclusions:
Summarisation layers might flatten nuance or selectively emphasise certain findings
Retrieval models trained on user behaviour could favour “satisfying” results over comprehensive ones
The synthesis step in RAG systems introduces editorial choices
Query reformulation by LLMs could introduce subtle biases before retrieval even occurs
Cognitive offloading, where users delegate thinking and evaluation to the tool
These are testable hypotheses. We should be investigating them rigorously—and some researchers are beginning to—not assuming the worst based on a misapplied understanding of sycophancy.
Misconception 2: "Friction is good for learning, so AI search shouldn't give you what you ask for"
I’ve heard the suggestion that AI search engines are problematic because they give people what they want.
Think about how strange this sounds. Forget AI—consider traditional Boolean search. If I search for:
vaccines cause autism
and the search engine returns documents containing those words, it’s “giving me what I asked for.” Is that bad?
Semantic search is simply this on steroids. If precise retrieval isn’t problematic for keyword search, why would it be problematic for semantic search?
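To see the difference (and the similarity), here is a toy contrast in Python. The hand-written synonym table is a crude stand-in for what an embedding model learns; real semantic search compares dense vectors, but the essential shift is the same: matching by meaning rather than by exact strings.

```python
# Toy contrast between Boolean and semantic matching. The synonym table is a
# stand-in for what an embedding model learns; real semantic search scores
# dense vectors by cosine similarity.

SYNONYMS = {
    "vaccines": {"vaccination", "immunisation"},
    "cause": {"causes", "linked"},
}

def boolean_match(query: str, doc: str) -> bool:
    """Boolean AND: every query term must appear verbatim in the document."""
    doc_terms = set(doc.lower().split())
    return all(term in doc_terms for term in query.lower().split())

def semantic_match(query: str, doc: str) -> bool:
    """Match if each query term, or something it 'means', appears."""
    doc_terms = set(doc.lower().split())
    return all(
        term in doc_terms or SYNONYMS.get(term, set()) & doc_terms
        for term in query.lower().split()
    )

doc = "no evidence that vaccination is linked to autism"
print(boolean_match("vaccines cause autism", doc))   # False: no verbatim match
print(semantic_match("vaccines cause autism", doc))  # True: matched on meaning
```

Notice that the semantically matched document actually contradicts the query’s premise: precise retrieval of “what you asked for” is exactly how a user encounters the counter-evidence.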
I should acknowledge that the analogy between Boolean and semantic search isn’t perfect. Boolean search possesses a kind of mechanical indifference: it matches character strings without understanding. Semantic search, by contrast, interprets meaning. It makes invisible judgements about what is “conceptually related” to your query5.
But bad retrieval is not the solution. Even if semantic search introduces these biases, the underlying claim—that we should tolerate poor retrieval because cognitive friction has pedagogical value—is flawed.
There is certainly value in some friction: working through keyword translation (arguably), understanding knowledge organisation, and evaluating sources (definitely) are productive challenges. But a relevancy algorithm that returns irrelevant results isn’t productive friction. It’s just poor retrieval.
Should we deliberately worsen our library catalogue rankings to create more friction? Obviously not. Not all friction is valuable, and we should be precise about which kinds are.
Why you actually need and want good retrieval
Consider what good retrieval enables:
Surfacing multiple perspectives. If a student wants to understand both sides of a debate, they need precise retrieval of the strongest arguments from each position—not random noise in the relevancy rankings.
Addressing filter bubble concerns properly. If you’re worried about users asking “show me papers saying vaccines cause autism,” the solution is teaching them to ask more open-ended questions and evaluate what they find. Degrading retrieval quality isn’t the answer.

A brief aside: tools that silently redirect user queries to “safer” queries aren’t the solution either. Paternalistic redirection is just filter bubbles with different politics. You can’t coherently worry about tools giving users what they want while praising tools that substitute what the LLM thinks users should see.
We’ve already seen this problem manifest in practice. As I documented in “The AI-Powered Library Search That Refused to Search,” content-moderation layers in tools like Primo Research Assistant and Summon Research Assistant were blocking searches on topics like “Tulsa race massacre” and “Gaza War”—returning zero results or error messages for legitimate scholarly queries. These filters, designed for social media chatbots, have no place in academic discovery systems. A search tool that decides which historical atrocities are too sensitive to research is not protecting users; it’s censoring scholarship.
If source quality is the concern, we already know what to do! Curate at the corpus level—don’t let models second-guess user intent. Or be inclusive with sources and invest in teaching source evaluation and critical thinking.
The bottom line
“AI search” is still search. These are RAG systems with retrieval and generation components. The retrieval component should optimise for precision and recall. Generation concerns—hallucination, over-summarisation—are separate issues that don’t excuse poor retrieval.
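For concreteness, here are precision and recall in the retrieval sense, the two quantities the retrieval component should be optimised for (standard definitions; the document IDs are hypothetical):

```python
# Precision: what fraction of retrieved documents are relevant?
# Recall: what fraction of relevant documents were retrieved?

def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

p, r = precision_recall(retrieved={"d1", "d2", "d3"}, relevant={"d2", "d3", "d7"})
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.67
```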
But I want to be careful not to be too tidy here. The concerns librarians raise, though often imprecisely articulated, sometimes point toward real issues of bias:
The retrieval/generation boundary is messier in agentic systems than in simple RAG
Semantic interpretation introduces judgements and possible bias that keyword matching doesn’t
“Consensus machine” behaviour could systematically disadvantage certain scholarship
We lack rigorous empirical studies on many of these questions
The appropriate response isn’t to dismiss AI search based on vague misunderstandings. It’s to get precise about what the actual failure modes are, and to investigate them empirically.
We can value deliberate friction (slowing users down, prompting reflection, not auto-generating full papers) while still demanding that search actually works. We can acknowledge genuine concerns about agentic query reformulation while rejecting the conflation of sycophancy with retrieval.
The library world has traditionally been far too forgiving of poor search rankings and user interfaces; we certainly don’t want fuzzy thinking about search to give vendors an opportunity to excuse poor products!
I found after publishing this that Mike Caulfield of SIFT fame also wrote about roughly the same issue from a different angle; you can read his post, “AI sycophancy” is not always harmful.
1. Sycophancy is not a bug in the code; it is a feature of the training. Techniques like Reinforcement Learning from Human Feedback (RLHF) specifically optimise models to be helpful and harmless, which the model often interprets as “never disagree with the user.” A search index, by contrast, has no social anxiety. It doesn’t care if the results offend your false premise.
2. To be fair, when RAG systems fail to retrieve relevant results, the LLM may still be predisposed to attempt an answer rather than saying “I don’t know.” This is a genuine concern—but arguably a different issue from sycophancy.
3. This is of course a huge simplification. LLMs approximate the conditional distribution of the corpus of text they are trained on. If your prompt is underspecified, the distribution’s peak often corresponds to “what most people would say.” That looks like consensus (a compact way to write this is sketched after these notes).
4. See for example this series of comments.
5. But again, this isn’t being sycophantic.
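Footnote 3, stated a little more formally (my notation, and still a deliberate simplification): the trained model fits the corpus distribution, and for an underspecified prompt, low-temperature decoding lands near the modal answer.

```latex
% p_theta is the trained model, p_corpus the empirical distribution of the
% training text, q an underspecified prompt, and a a candidate answer.
\[
  p_\theta(a \mid q) \;\approx\; p_{\text{corpus}}(a \mid q),
  \qquad
  \hat{a} = \arg\max_{a} \, p_\theta(a \mid q).
\]
% Low-temperature decoding returns something near \(\hat{a}\), the modal
% answer, i.e. roughly ``what most people would say'' --- which reads as consensus.
```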