Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs

The paper introduces AdaAnchor, a latent reasoning framework that performs silent iterative computation by refining input-attached anchor vectors with an adaptive halting mechanism, thereby significantly improving accuracy and reducing inference costs compared to both verbose Chain-of-Thought prompting and fixed-step latent methods.

Disha Sheshanarayana, Rajat Subhra Pal, Manjira Sinha, Tirthankar Dasgupta

Published 2026-03-17

Imagine you have a very smart but chatty friend (an AI) who is great at solving math problems. Usually, when you ask them a question, they don't just give you the answer. They write out a long, step-by-step diary of their thought process: "Okay, first I need to add these numbers, then I realize I made a mistake, so I subtract that, then I multiply..."

This is called Chain-of-Thought (CoT). It works well, but it's slow and expensive. It's like hiring a lawyer who charges by the word; the more they talk to explain their logic, the more it costs you, and the longer you have to wait for the verdict.

The Problem: Too Much Talking, Not Enough Thinking

Researchers noticed that sometimes, the AI doesn't need to write down every single thought to get the right answer. It just needs to "think" about it internally.

Previous attempts to fix this tried to make the AI "think silently" inside its own brain (in a hidden "latent space") without writing anything down. But there was a catch: The AI didn't know when to stop thinking.

It was like asking your friend, "Solve this, but think for exactly 10 minutes."

  • If the problem was easy (like 2+2), they wasted 9 minutes and 50 seconds just staring at the wall.
  • If the problem was hard, 10 minutes wasn't enough, and they gave up too soon.

The Solution: AdaAnchor (The "Smart Pause Button")

The paper introduces a new method called AdaAnchor. Here is how it works, using a simple analogy:

1. The "Mental Scratchpad" (Latent Anchors)

Instead of writing on a piece of paper (tokens), the AI uses a special mental scratchpad. Imagine a set of invisible sticky notes attached to the question.

  • Step 1: The AI looks at the question and writes a rough idea on the sticky notes.
  • Step 2: It looks at the notes, thinks, and rewrites the notes with a better idea.
  • Step 3: It repeats this, refining the notes over and over.

Crucially, none of this is written down for you to see. The AI is just updating these invisible notes in its head.
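To make the sticky-note loop concrete, here is a minimal sketch in plain Python. It is a toy, not the paper's implementation: the real method refines anchor vectors with passes through the language model, while `toy_update`, the `target` vector, and the `rate` value below are all hypothetical stand-ins chosen so the example runs on its own.

```python
from typing import List

Vector = List[float]


def toy_update(anchors: Vector, target: Vector, rate: float = 0.5) -> Vector:
    """Hypothetical stand-in for one model pass.

    Moves every anchor value part of the way toward `target`.
    In the real system this would be a forward pass of the LLM.
    """
    return [a + rate * (t - a) for a, t in zip(anchors, target)]


def refine_fixed(anchors: Vector, target: Vector, steps: int) -> Vector:
    """Rewrite the notes a fixed number of times; no text is generated."""
    for _ in range(steps):
        anchors = toy_update(anchors, target)
    return anchors


# Start with blank notes and refine them three times.
notes = refine_fixed([0.0, 0.0], target=[1.0, -2.0], steps=3)
print(notes)  # [0.875, -1.75]
```

Each pass rewrites the notes in place, and nothing is shown to the user until the loop ends. Note that the number of steps here is fixed in advance, which is exactly the limitation the next part addresses.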

2. The "Stability Check" (Adaptive Halting)

This is the magic part. The AI has a built-in rule: "Stop thinking when the notes stop changing."

  • The Scenario: Imagine you are trying to solve a riddle. You keep guessing answers in your head.
    • Guess 1: "Maybe it's a cat?" (Notes change)
    • Guess 2: "No, maybe a dog?" (Notes change)
    • Guess 3: "Wait, it's definitely a dog." (Notes change slightly)
    • Guess 4: "Yeah, it's a dog." (Notes are exactly the same as before)

The moment the AI realizes, "Hey, my thoughts aren't changing anymore; I've found the answer," it hits the Stop Button.

  • Easy Problem: The AI thinks for 2 seconds, realizes it's done, and says the answer.
  • Hard Problem: The AI thinks for 20 seconds, refining its notes until they finally settle, then says the answer.

Why This is a Big Deal

The researchers tested this on math problems and found three amazing things:

  1. It's Faster and Cheaper: Because the AI skips the written-out reasoning and goes almost straight to the answer, it types out about 92-93% fewer "words" (output tokens) than it would with Chain-of-Thought. It's like going from a 10-page essay to a one-word answer, but with the same intelligence.
  2. It's Smarter: By letting the AI decide how long to think based on the difficulty of the problem, it actually gets more accurate than forcing it to think for a fixed amount of time. It spends more time on hard problems and less time on easy ones.
  3. It's Efficient: It saves money and energy because the computer doesn't have to process thousands of extra words that nobody reads.
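To get a feel for what a 92-93% saving means, here is a small worked calculation. The token counts are hypothetical numbers picked for illustration and are not taken from the paper; only the rough size of the reduction comes from the summary above.

```python
def token_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Percentage of generated tokens saved relative to the baseline."""
    return 100.0 * (baseline_tokens - new_tokens) / baseline_tokens


# Hypothetical example: a written-out Chain-of-Thought answer of 200 tokens
# versus a short final answer of 15 tokens after silent refinement.
cot_tokens = 200
latent_tokens = 15
print(f"{token_reduction(cot_tokens, latent_tokens):.1f}% fewer generated tokens")
# prints "92.5% fewer generated tokens"
```

This counts only generated text. The silent refinement steps still cost some computation, so the saving in tokens is not the same thing as the saving in total compute.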

The Bottom Line

AdaAnchor is like giving an AI a "smart pause button." Instead of forcing it to write a long diary entry for every problem, it lets the AI do its thinking silently on invisible sticky notes. It keeps refining those notes until they stop changing, then it just hands you the final answer.

This means we can have AI that thinks deeply and solves hard problems, but doesn't waste time or money chattering about how it did it.
