The Stepwise Informativeness Assumption: Why are Entropy Dynamics and Reasoning Correlated in LLMs?

This paper proposes and empirically validates the Stepwise Informativeness Assumption, which explains the correlation between internal entropy dynamics and reasoning correctness in large language models: maximum-likelihood and reinforcement-learning training push valid reasoning traces to accumulate answer-relevant information in expectation.

Mar Gonzàlez I Català, Haitz Sáez de Ocáriz Borde, George D. Montañez, Pietro Liò

Published 2026-04-09

The Big Mystery: Why Does "Confusion" Mean "Thinking"?

Imagine you are watching a detective solve a crime in a movie.

  • The Bad Detective: He guesses wildly, gets confused, changes his mind constantly, and ends up with a wrong answer. His internal monologue is chaotic and full of "maybe this, maybe that."
  • The Good Detective: He starts with many possibilities, but with every clue he finds, he crosses off the wrong ones. His internal monologue becomes quieter and more focused. By the end, he is 100% sure of the answer.

In the world of Large Language Models (LLMs), researchers have noticed a strange pattern: When a model's "internal confusion" (called Entropy) goes down, it usually means it's getting the right answer.

But here is the puzzle: The model doesn't know the "right answer" while it is thinking. It only knows what words it has written so far. So, why does its internal feeling of "I'm getting closer" match the external reality of "I'm right"?
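To make "internal confusion" concrete: entropy here is the Shannon entropy of the model's next-token probability distribution. A minimal sketch (the logit values are made up for illustration, not taken from the paper):

```python
import math

def token_entropy(logits):
    """Shannon entropy (in bits) of the next-token distribution.

    logits: raw scores over a toy vocabulary; hypothetical values here.
    """
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A "confused" model spreads probability evenly -> high entropy.
print(token_entropy([0.0, 0.0, 0.0, 0.0]))   # 2.0 bits (uniform over 4 tokens)

# A "confident" model concentrates on one token -> entropy near zero.
print(token_entropy([10.0, 0.0, 0.0, 0.0]))
```

High entropy means the model sees many plausible next words; entropy near zero means it has effectively decided.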

The Solution: The "Stepwise Informativeness Assumption" (SIA)

The authors of this paper propose a simple rule to explain this magic. They call it the Stepwise Informativeness Assumption (SIA).

Think of reasoning like hiking down a mountain to find a hidden treasure.

  1. The Map (The Model): The model is the hiker.
  2. The Treasure (The Answer): The correct answer is at the bottom of the mountain.
  3. The Fog (Entropy): The fog represents how confused the hiker is about where the treasure is. High fog = lost. Low fog = close.

The SIA Rule says:

"Every step the hiker takes (every word the model writes) should, on average, clear away a little bit of the fog and point toward the treasure."

If the hiker is taking steps that don't clear the fog (like walking in circles or heading the wrong way), they are hallucinating or "overthinking." But if the hiker is taking steps that do clear the fog, they are reasoning correctly.
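The fog metaphor can be phrased as a simple check on an entropy trace: on average, each step should lower the entropy. A toy sketch, assuming we already have one entropy value per reasoning step (all numbers below are invented for illustration):

```python
def mean_step_drop(entropies):
    """Average per-step entropy change; negative means the fog is clearing."""
    drops = [b - a for a, b in zip(entropies, entropies[1:])]
    return sum(drops) / len(drops)

good_trace = [3.2, 2.5, 1.9, 1.1, 0.4, 0.0]   # hypothetical: steady clearing
bad_trace  = [3.2, 3.1, 3.4, 3.0, 3.3, 3.2]   # hypothetical: walking in circles

print(mean_step_drop(good_trace))  # clearly negative: informative steps
print(mean_step_drop(bad_trace))   # roughly zero: no information gained
```

The SIA claims the first pattern, holding *in expectation*, is what training instills in a well-behaved reasoner.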

How Do Models Learn to Do This?

You might ask, "Do models naturally know how to clear the fog?"

No. The paper explains that models learn this through training, much like a student learning to study for a test.

  • Pre-training (Reading the Library): The model reads millions of books. It learns to predict the next word. Sometimes it learns to solve math problems, but mostly it just learns to sound like a human. At this stage, it's like a student who has read a lot but hasn't been taught how to solve a specific problem. They might guess the answer, but their "fog" doesn't necessarily clear up in a logical way.
  • Fine-Tuning (The Tutor): Then, humans show the model examples of problems with the correct step-by-step solutions. The model is rewarded for following the path that leads to the right answer.
    • The Analogy: Imagine a tutor telling the student, "Don't just guess! Look at the clues you found in step 1; they tell you exactly what to do in step 2."
    • The model learns that to get the reward (the correct answer), it must write steps that accumulate information. It learns that every sentence it writes should make the final answer more obvious.
  • Reinforcement Learning (The Coach): Finally, the model practices on its own. If it gets the answer right, the coach gives a high-five. If it gets it wrong, the coach says "try again." This reinforces the habit of clearing the fog step-by-step.
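The coach's signal can be sketched in a few lines. This is a rejection-sampling-style simplification of outcome-based RL, not the paper's actual training setup; the rollouts and entropy values are invented:

```python
def outcome_reward(answer, gold):
    """The 'coach' signal: 1 for the right final answer, else 0."""
    return 1.0 if answer == gold else 0.0

# Toy rollouts: (final_answer, per-step entropy trace) -- all values hypothetical.
rollouts = [
    ("42", [3.1, 2.0, 0.9, 0.1]),   # fog clears, right answer
    ("17", [3.1, 3.0, 3.2, 3.0]),   # fog stays thick, wrong answer
]
gold = "42"

# Keep only rewarded rollouts for the next round of training.
kept = [trace for answer, trace in rollouts if outcome_reward(answer, gold) > 0]
print(len(kept))  # 1 -- only the fog-clearing trace gets reinforced
```

Because correct answers tend to come from fog-clearing traces, rewarding correctness indirectly rewards stepwise informativeness.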

The "Signatures" of a Good Thinker

The paper found that when a model has learned this "SIA" habit, its internal "fog" behaves in very specific ways that we can measure:

  1. Early Lock-in: A smart model clears the fog early. It figures out the direction quickly. A confused model keeps the fog thick for a long time, wandering around.
  2. The Plateau: Once a smart model finds the treasure, the fog disappears completely (it hits zero). A confused model might get stuck in a "foggy valley" where the fog stops getting thinner, but it still hasn't found the treasure.
  3. The Shuffle Test: The researchers tried to scramble the order of the words in a "good" reasoning chain. Suddenly, the magic disappeared! The model looked confused again. This proves that the order of the words matters. The steps must build on each other like a ladder, not just be a pile of bricks.
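The first two signatures are easy to operationalize from an entropy trace. A minimal sketch (the threshold and both traces are illustrative choices, not values from the paper):

```python
def lockin_step(entropies, threshold=0.5):
    """Index of the first step where entropy falls below threshold, else None."""
    for i, h in enumerate(entropies):
        if h < threshold:
            return i
    return None

smart = [3.0, 1.2, 0.3, 0.1, 0.0, 0.0]   # hypothetical: early lock-in, plateau at zero
stuck = [3.0, 2.8, 2.7, 2.6, 2.6, 2.6]   # hypothetical: the "foggy valley"

print(lockin_step(smart))  # 2 -- direction found by the third step
print(lockin_step(stuck))  # None -- fog never thins out
```

Early lock-in shows up as a small index; the foggy valley shows up as a plateau well above zero, with no lock-in at all.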

Why Should We Care?

This isn't just about math puzzles. This discovery gives us a flashlight to see inside the "black box" of AI.

  • Detecting Hallucinations: If an AI is writing a long story but its internal "fog" isn't getting thinner, we know it's making things up, even if the words sound fancy.
  • Stopping Early: If the fog has cleared completely, we can tell the AI, "Okay, you've got it, stop talking!" This saves money and time.
  • Building Better AI: We now know that to make smarter AI, we shouldn't just feed them more data; we need to train them specifically to write steps that accumulate information about the answer.
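The early-stopping idea from the list above can be sketched as a simple rule: stop generating once entropy has stayed near zero for a few consecutive steps. The window size and threshold below are illustrative assumptions, not values from the paper:

```python
def should_stop(entropies, window=3, eps=0.05):
    """Stop generating once the last `window` entropies are all near zero.

    Thresholds are illustrative, not taken from the paper.
    """
    if len(entropies) < window:
        return False
    return all(h < eps for h in entropies[-window:])

print(should_stop([2.4, 1.1, 0.3, 0.02, 0.01, 0.0]))  # True: fog cleared, safe to stop
print(should_stop([2.4, 1.1, 0.9]))                    # False: still thinking
```

Conversely, a hallucination detector could flag generations whose entropy never trends downward at all.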

Summary

The paper solves the mystery of why AI "confidence" matches "correctness." It turns out that when we train AI well, we teach it a simple rule: "Every word you write should make the answer a little bit clearer." When this rule is followed, the model's internal confusion drops exactly when it gets the answer right.
