Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

This paper shows that enabling reasoning in large language models significantly improves recall of simple factual knowledge through two mechanisms: computational buffering and factual priming. It also shows that hallucinating intermediate facts during reasoning makes final-answer errors more likely, a finding that can be leveraged to improve accuracy by prioritizing hallucination-free reasoning trajectories.

Zorik Gekhman, Roee Aharoni, Eran Ofek, Mor Geva, Roi Reichart, Jonathan Herzig

Published Wed, 11 Ma

Imagine you have a friend who is a walking encyclopedia. They know a million facts, but sometimes, when you ask them a simple question like "Who was the 10th King of Nepal?", they just... blank out. They know the answer is in their head, but they can't quite pull it out.

Now, imagine you tell this friend: "Before you answer, just take a moment to think out loud. Say whatever comes to mind, even if it's just rambling."

Surprisingly, this simple act of "thinking out loud" (which AI researchers call Reasoning) often unlocks the answer. But here's the twist: the questions aren't hard math problems or complex puzzles. They are simple facts. So, why does "thinking" help?

A new study from Google and Israeli universities dives into this mystery. They found that "thinking" helps in two specific ways, and one of them is a bit risky.

Here is the breakdown using simple analogies:

1. The "Warm-Up" Effect (Computational Buffer)

Think of your brain like a high-performance engine. When you ask a direct question, the engine might be cold, and the spark plug (the answer) doesn't fire immediately.

When the AI starts "thinking," it generates a bunch of words first. Even if those words are nonsense (like repeating "Let me think, let me think, let me think"), the act of generating them warms up the engine. It gives the AI's internal computer more time to run calculations in the background.

  • The Analogy: It's like a runner doing a few warm-up laps before the race. Even if the runner isn't sprinting yet, the extra movement gets the blood flowing and the muscles ready to perform. The study found that just having more text to process (even if it's gibberish) helps the AI access facts it couldn't reach before.
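The buffering idea can be counted out in a toy sketch. The "model" below is just a stub that tallies steps, and the filler phrase is illustrative (not taken from the paper), but it shows why even content-free tokens buy extra computation: each generated token is one more forward pass before the model has to commit to an answer.

```python
# Toy illustration of the "computational buffer" effect. Each token a model
# generates is one extra forward pass, so forcing it to emit content-free
# filler before answering buys it extra internal computation. The model here
# is a stub that just records steps; the filler phrase is an assumption.

def generate_with_buffer(model_step, question, filler_tokens=8):
    """Run `filler_tokens` throwaway decoding steps, then the recall step."""
    for _ in range(filler_tokens):
        model_step("let me think")   # content-free padding token
    return model_step(question)      # the actual recall step

# Stub "model step": records how many forward passes were run.
steps = []
def stub_step(token):
    steps.append(token)
    return token

generate_with_buffer(stub_step, "Who was the 10th King of Nepal?")
print(len(steps))  # 9: eight warm-up passes plus the recall step
```

A real model would be doing useful background computation on each of those passes; the stub only makes the step-counting visible.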

2. The "Rearranging the Bookshelf" Effect (Factual Priming)

This is the more interesting part. When the AI thinks, it doesn't just ramble; it often starts listing related facts.

  • The Analogy: Imagine your knowledge is a giant, messy library. You want to find a specific book (the answer). If you just ask for the book, the librarian (the AI) might miss it. But if the librarian starts shouting out titles of books near the one you want ("Oh, I remember a book about King Prithvi... and another about King Mahendra..."), those names act as breadcrumbs.

By listing the 1st through 9th Kings of Nepal, the AI "primes" its brain. It builds a semantic bridge. Once it has listed the first nine, the 10th one suddenly becomes much easier to find. The AI is essentially "self-retrieving" the answer by talking its way there.

The Danger Zone: The "Fake News" Trap

Here is the catch. Because the AI is generating these "breadcrumbs" (the related facts) itself, it can make mistakes.

  • The Analogy: Imagine the librarian is trying to help you find that book, but they are hallucinating. They say, "Oh, the 1st King was named 'Zog'." (That's fake). Then they say, "The 2nd was 'Zog's son'." (Also fake). By the time they get to the 10th King, they are so deep in their own made-up story that they give you the wrong answer.

The study found a scary pattern: if the AI lies during its "thinking" phase, it is much more likely to lie in the final answer. The "thinking" process can trap the AI in a web of its own hallucinations.

The Solution: The "Fact-Checker" Strategy

So, how do we fix this? The researchers suggest a smart strategy for using these AI models:

Instead of just taking the first answer the AI gives, we should look at its "thinking" process first.

  1. Check the thinking: Did the AI list some facts?
  2. Verify the facts: Are those facts true?
  3. Pick the winner: If the AI's "thinking" contains true facts and no lies, that's a high-quality answer. If the "thinking" is full of nonsense or lies, discard it and try again.
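The three steps above amount to sampling several reasoning trajectories and keeping the first one whose intermediate facts check out. Here is a minimal sketch, with a mocked set of sampled trajectories and a hand-made fact set standing in for a real verifier (both are assumptions for illustration, not the paper's implementation):

```python
# Sketch of the "fact-checker" selection strategy. `traces` stands in for
# several sampled (reasoning, answer) pairs from a model, and the verifier
# is a tiny hand-made fact set, not a real fact-checking system.

def extract_facts(trace):
    """Toy extractor: treat every line starting with 'FACT:' as a claim."""
    return [line[len("FACT: "):].strip()
            for line in trace.splitlines()
            if line.startswith("FACT: ")]

def pick_best_answer(traces, is_true):
    """Return the answer from the first trace whose intermediate facts
    all verify; fall back to the first trace if none is clean."""
    for trace, answer in traces:
        facts = extract_facts(trace)
        if facts and all(is_true(f) for f in facts):
            return answer  # hallucination-free reasoning wins
    return traces[0][1]    # no clean trace: take the default sample

# Toy data: two sampled trajectories for the same question.
known_facts = {"Mahendra was the 9th King of Nepal"}
traces = [
    ("FACT: Zog was the 9th King of Nepal\nSo...", "Zog II"),        # hallucinated
    ("FACT: Mahendra was the 9th King of Nepal\nSo...", "Birendra"),  # verified
]
print(pick_best_answer(traces, lambda f: f in known_facts))  # Birendra
```

The key design choice is the one the paper's finding motivates: an answer is only trusted when the reasoning that produced it contains no false intermediate facts.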

The Bottom Line

The paper teaches us that "thinking" isn't just for solving hard math problems. For simple facts, it acts like a mental warm-up and a memory bridge. However, we have to be careful because that bridge can collapse if the AI starts making things up.

By teaching AI to "think" correctly and checking its work before it speaks, we can unlock a whole new level of knowledge that was previously locked away inside the model.