Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets

This paper argues that single-agent LLMs are more information-efficient than multi-agent systems for multi-hop reasoning under equal token budgets, demonstrating through empirical study and information-theoretic analysis that reported multi-agent advantages often stem from uncontrolled computation and context artifacts rather than inherent architectural benefits.

Dat Tran, Douwe Kiela

Published 2026-04-06

Imagine you are trying to solve a very tricky puzzle, like a complex riddle that requires connecting three different clues to find the answer. You have two ways to tackle this:

  1. The Lone Genius (Single-Agent): One smart person sits at a desk, thinks hard, writes down their entire thought process, and solves it alone.
  2. The Committee (Multi-Agent): A group of people sits around a table. They pass notes back and forth, debate each other, split the work, and try to solve it together.

For a long time, many people assumed the Committee was better because they had more "brainpower" and could discuss ideas. But this new paper asks a crucial question: Are they actually smarter, or are they just using more paper and ink?

The Big Discovery: It's About the Budget

The researchers realized that in previous studies, the Committee was almost always allowed to use way more paper (tokens) than the Lone Genius. The Committee could write 10 pages of notes while the Genius was only allowed 1 page. Of course, the Committee looked better! They had more space to think.

This paper forced them to use the exact same amount of paper (a fixed "thinking budget"). When they did this, the results were surprising:

The Lone Genius almost always won or tied with the Committee.
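
To make the "same amount of paper" idea concrete, here is a minimal sketch of equal-budget accounting. It is only an illustration: the names BudgetPlan, single_agent_plan, and multi_agent_plan are hypothetical, and the even split across agents is an assumption, not necessarily how the paper divides its budget.

```python
from dataclasses import dataclass


@dataclass
class BudgetPlan:
    """Thinking-token allowance for each agent (hypothetical helper type)."""
    per_agent: list[int]

    @property
    def total(self) -> int:
        # The quantity that must match across the two setups.
        return sum(self.per_agent)


def single_agent_plan(total_budget: int) -> BudgetPlan:
    """The Lone Genius gets the whole budget."""
    return BudgetPlan([total_budget])


def multi_agent_plan(total_budget: int, n_agents: int) -> BudgetPlan:
    """Split the same budget evenly across agents (assumed policy).

    Leftover tokens go to the first agents so the total is preserved.
    """
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    base, extra = divmod(total_budget, n_agents)
    return BudgetPlan([base + (1 if i < extra else 0) for i in range(n_agents)])
```

With a 10,000-token budget and three agents, each agent gets roughly a third of the paper, and the totals of the two plans are identical. That equality is the condition the comparison depends on.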

Why? The "Telephone Game" Analogy

The authors use a concept from information theory to explain this. Imagine you are playing the Telephone Game (where a message is whispered from person to person).

  • The Lone Genius hears the whole story once, thinks about it, and writes the answer. The information stays fresh and complete in their head.
  • The Committee has to whisper the story from Person A to Person B, then to Person C. Every time they pass a note, a little bit of the message gets lost, distorted, or forgotten. Even if they are all very smart, the act of passing the message around introduces "noise."
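
The note-passing loss can be illustrated with a toy model. This is not the paper's analysis: it assumes the "message" is a single fair coin flip and that every hand-off independently flips it with probability p (a binary symmetric channel). The function names are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a coin that lands heads with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def flip_prob_after_hops(p: float, hops: int) -> float:
    """Chance the bit arrives flipped after `hops` independent noisy hand-offs."""
    return (1 - (1 - 2 * p) ** hops) / 2


def bits_preserved(p: float, hops: int) -> float:
    """Mutual information (bits) between the original bit and the final copy,
    assuming the original bit is a fair coin."""
    return 1 - binary_entropy(flip_prob_after_hops(p, hops))


if __name__ == "__main__":
    for hops in range(4):
        print(hops, round(bits_preserved(0.1, hops), 3))
```

With zero hand-offs the full bit survives. Each extra hand-off leaves less of it, which is the Telephone Game in miniature.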

The paper argues that unless the Committee gets a real advantage (such as a much bigger budget), the Lone Genius is more efficient because no information is lost in hand-offs between people.
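
This summary does not name the information-theoretic result behind the Telephone Game picture. One standard statement that fits it is the data processing inequality; treating it as the relevant result here is an assumption. The symbols are labels introduced for illustration: X is the original story, Y is the note Person A writes, and Z is what Person B works out from that note alone.

```latex
X \to Y \to Z \quad \Longrightarrow \quad I(X; Z) \le I(X; Y)
```

In words: if Person B only sees the note, B cannot know more about the original story than the note itself contains.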

When Does the Committee Win?

The paper found one specific situation where the Committee shines: When the puzzle is messy.

Imagine the Lone Genius is trying to read a book, but someone has spilled coffee on the pages, torn out half the text, or written random nonsense words over the clues. The Genius gets confused and can't find the answer.

In this case, the Committee can help. Because they are splitting the work, one person can focus on cleaning up the coffee stains, another can ignore the nonsense words, and a third can double-check the facts. They can "filter out" the mess better than one person trying to do it all at once.

The Lesson: If the information is clear, a single smart brain is best. If the information is messy or broken, a team can sometimes fix it.

The "Hidden Trick" in the Results

The researchers also found a funny glitch in how some AI systems (specifically Google's Gemini) report their work.

  • When asked to "think for 10,000 steps," the Lone Genius would often stop writing after 300 steps, even though the computer said it used 10,000.
  • The Committee, however, would actually write out all 10,000 steps because they were passing notes between different "people."

This made the two look like they had the same budget when the Committee was actually getting far more room to think. The paper suggests that many previous studies claiming "Teams are better" were actually just measuring "Teams who got to use more paper."
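
A simple sanity check for this kind of mismatch is to compare the budget that was requested with the thinking that actually shows up. The sketch below is hypothetical: the function names and the 10% tolerance are assumptions, and it expects you to supply the observed token counts yourself rather than calling any real API.

```python
def budget_utilization(requested: int, observed: int) -> float:
    """Fraction of the requested thinking budget that was actually observed."""
    if requested <= 0:
        raise ValueError("requested budget must be positive")
    return observed / requested


def is_fair_comparison(single_observed: int, multi_observed: int,
                       tolerance: float = 0.1) -> bool:
    """True when the two systems' observed thinking differs by at most
    `tolerance` (relative to the larger of the two)."""
    larger = max(single_observed, multi_observed)
    if larger == 0:
        return True
    return abs(single_observed - multi_observed) / larger <= tolerance
```

Using the numbers from the example above, a Lone Genius that writes 300 steps against a 10,000-step request has a utilization of 0.03, and comparing it with a Committee that writes all 10,000 fails the fairness check.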

The Takeaway

If you want to solve a reasoning problem efficiently:

  1. Don't assume more people = better results. Often, a single, focused mind is more efficient if given the same resources.
  2. Watch out for "fake" effort. Just because an AI says it "thought" a lot doesn't mean it actually processed more information.
  3. Use a team only when things are messy. If the data is noisy or confusing, a team structure can help filter out the bad information.

In short: For clear thinking, one smart brain is usually better than a committee of brains passing notes, unless the material is so messy that splitting up the clean-up work pays off.
