Not All Queries Need Deep Thought: CoFiCot for Adaptive Coarse-to-fine Stateful Refinement

The paper proposes CoFiCot, an adaptive coarse-to-fine framework that dynamically allocates test-time computation by triaging queries based on multi-metric difficulty assessment and applying stateful, context-aware refinement to balance efficiency and reasoning accuracy.

Dongxu Zhang, Hongqiang Lin, Yiding Sun, Pengyu Wang, Qirui Wang, Ning Yang, Jihua Zhu

Published Tue, 10 Ma

Imagine you are an agent at a busy call center. Your goal is to solve every customer's problem perfectly.

In the past, your company had a strict rule: "Every call gets exactly 10 minutes of your time."

This rule caused two big problems:

  1. The "Over-Thinker" Problem: If a customer just asked, "What's 2+2?", you spent 10 minutes agonizing over it. You might get confused, change your answer to "3," then "5," and finally give the wrong answer because you over-complicated a simple thing.
  2. The "Under-Thinker" Problem: If a customer asked a complex question like, "How do I fix a broken engine while calculating fuel efficiency?", 10 minutes wasn't enough. You ran out of time halfway through and gave a wrong answer because the job was never finished.

This is the same problem many Large Language Models (the "AI" in this story) face today. With a fixed test-time budget, they spend the same amount of "brain power" on every question, whether it's easy or hard.

The paper introduces a new system called CoFiCot, an adaptive coarse-to-fine reasoning framework. Think of it as hiring a smart, adaptive supervisor who changes the strategy based on the difficulty of the call.

Here is how CoFiCot works, broken down into simple steps:

1. The "Triage" (The Quick Glance)

Before the AI starts solving the problem, it takes a quick look at the question and generates a few rough drafts of answers.

  • The Supervisor's Job: It checks these drafts like a doctor triaging patients.
    • Are all the drafts agreeing? (High Confidence)
    • Do the drafts look high quality? (Reliability)
    • Does the question look like it needs a long explanation? (Complexity)

Based on this, the supervisor sorts the questions into three bins: Easy, Medium, and Hard.
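The triage step can be sketched in a few lines of Python. This is only an illustration of the idea, not the paper's actual scoring: the function name `triage`, the equal weighting of the three signals, the word-count proxy for complexity, and the thresholds `easy_max` and `hard_min` are all assumptions made for this example.

```python
from collections import Counter


def triage(question, drafts, draft_scores, easy_max=0.3, hard_min=0.6):
    """Sort a question into 'easy', 'medium', or 'hard' from rough drafts.

    Hypothetical sketch: the signals mirror the three checks above, but the
    formulas and thresholds are illustrative assumptions.
    """
    # Agreement: share of drafts that match the most common final answer.
    top_count = Counter(drafts).most_common(1)[0][1]
    agreement = top_count / len(drafts)

    # Reliability: mean quality score of the drafts, each assumed in [0, 1].
    reliability = sum(draft_scores) / len(draft_scores)

    # Complexity: crude proxy from question length in words, capped at 1.0.
    complexity = min(len(question.split()) / 50, 1.0)

    # Less agreement, lower quality, and more complexity all raise difficulty.
    difficulty = ((1 - agreement) + (1 - reliability) + complexity) / 3

    if difficulty < easy_max:
        return "easy"
    if difficulty >= hard_min:
        return "hard"
    return "medium"
```

With three identical, high-scoring drafts for a short question, this sketch returns "easy". Drafts that disagree, score poorly, and answer a long question push the result to "hard".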

2. The "Differentiated Strategy" (Tailored Solutions)

Now, the AI treats these bins differently:

  • 🟢 The Easy Bin (The "Coffee Break" Zone):
    If the question is simple (like "What's 2+2?"), the supervisor says, "Stop! You already have the right answer. Don't waste time thinking more."
    The AI just picks the best answer from the first few drafts and moves on. This prevents the AI from "over-thinking" and accidentally messing up a simple answer.

  • 🔴 The Hard Bin (The "Deep Dive" Zone):
    If the question is complex (like a tricky math puzzle), the supervisor says, "This is tough. We need to work harder."
    The AI enters a correction loop. It doesn't just guess again; it tries to fix its mistakes step-by-step.
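A minimal sketch of this routing, under stated assumptions: "picks the best answer" is approximated here by a simple majority vote over the drafts, and `refine` is a hypothetical callable standing in for the correction loop. How the Medium bin is handled is not described above, so sending every non-easy question to `refine` is an assumption of this example.

```python
from collections import Counter


def route(bin_label, drafts, refine):
    """Return a final answer, spending extra compute only when needed."""
    if bin_label == "easy":
        # Easy: stop early and reuse the drafts (majority vote is an assumption).
        return Counter(drafts).most_common(1)[0][0]
    # Anything else: hand the drafts to the (hypothetical) correction loop.
    return refine(drafts)
```

The point of the design is that `refine` is never called for easy questions, so they cost no more than the initial drafts.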

3. The "Stateful Correction" (The "Don't Erase the Whole Blackboard" Trick)

This is the cleverest part of the paper.

Imagine a student solving a math problem on a blackboard.

  • Old AI Method (Stateless): If the student makes a mistake in Step 3, the old AI would erase the entire blackboard and start writing the whole problem from Step 1 again. This is slow and often leads to new mistakes because the student forgets the logic they had in Step 1.
  • CoFiCot Method (Stateful): If the student makes a mistake in Step 3, CoFiCot says, "Wait! Step 1 and Step 2 were perfect. Keep them exactly as they are." It only erases Step 3, fixes it, and then writes Step 4 based on the new Step 3.

This is called Stateful Sequential Correction. It ensures that the AI remembers what it got right and only fixes what is broken, keeping the whole chain of logic connected and strong.
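The blackboard idea can be sketched as a loop that keeps a verified prefix and rewrites only from the first bad step onward. The names `stateful_correct`, `is_valid`, `regenerate`, and `max_fixes` are hypothetical and chosen for this illustration. In a real system the two callables would presumably wrap a step checker and a language model call.

```python
def stateful_correct(steps, is_valid, regenerate, max_fixes=5):
    """Keep every verified step; rewrite only from the first bad step on.

    is_valid(prefix, step) -> bool   judges one step given the kept prefix.
    regenerate(prefix) -> list[str]  writes the remaining steps from the prefix.
    Both callables are assumptions of this sketch.
    """
    kept = []              # steps already verified; never erased
    pending = list(steps)  # steps still waiting to be checked
    fixes = 0

    while pending:
        step = pending.pop(0)
        if is_valid(kept, step):
            kept.append(step)
            continue
        if fixes == max_fixes:
            break  # budget exhausted: return only the verified prefix
        fixes += 1
        # Discard the bad step and everything after it, then continue
        # from the verified prefix instead of restarting at Step 1.
        pending = regenerate(kept)

    return kept
```

A stateless variant would call `regenerate([])` on every failure and throw away the steps that were already correct. Here the verified prefix is passed along each time, so later steps are always written on top of the corrected chain.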

4. The "Quality Control" (The Reward Models)

To make sure the AI is actually fixing things and not just guessing, CoFiCot uses two special "judges":

  • The Step-Judge (PRM, a Process Reward Model): Checks every single step of the reasoning. "Is this math right? Is this logic sound?"
  • The Final-Judge (ORM, an Outcome Reward Model): Looks at the whole answer at the end. "Is this the best possible solution?"

If the Step-Judge finds a mistake, the AI fixes it. If the Final-Judge says the answer is great, the AI stops and submits the answer.
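How the two judges might cooperate can be sketched as follows. Everything here is an assumption for illustration: the scoring callables `step_score` and `final_score` stand in for a PRM and an ORM, `fix_from` stands in for the model rewriting the tail of the solution, and the thresholds and round limit are arbitrary.

```python
def refine_with_judges(steps, step_score, final_score, fix_from,
                       step_min=0.5, final_min=0.8, max_rounds=3):
    """Fix the first low-scoring step each round; stop once the whole answer is good.

    step_score(prefix, step) -> float  plays the Step-Judge (PRM).
    final_score(steps) -> float        plays the Final-Judge (ORM).
    fix_from(prefix) -> list[str]      rewrites the solution after the prefix.
    """
    for _ in range(max_rounds):
        # Final-Judge: if the full answer already scores well, submit it.
        if final_score(steps) >= final_min:
            break
        # Step-Judge: locate the first step that scores below the threshold.
        bad = next((i for i, s in enumerate(steps)
                    if step_score(steps[:i], s) < step_min), None)
        if bad is None:
            break  # nothing specific to fix
        # Keep the steps before the mistake and rewrite from there.
        steps = steps[:bad] + fix_from(steps[:bad])
    return steps
```

The round limit keeps the loop from running forever when a fix does not raise the final score.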

Why is this a big deal?

  • Saves Money & Time: It doesn't waste computer power on easy questions.
  • Smarter Results: It gives complex questions the deep thinking they need without getting confused.
  • No More "Over-thinking": It stops the AI from turning a simple "Yes" into a confusing "Maybe, but actually no..."

In a nutshell: CoFiCot is like a smart manager who knows when to let an employee relax on a simple task and when to step in and help them fix a specific error on a hard task, without making them start the whole project over again. The goal is AI reasoning that is faster and cheaper on easy questions and more accurate on hard ones.