On Multi-Step Theorem Prediction via Non-Parametric Structural Priors

This paper introduces a training-free, non-parametric approach to multi-step theorem prediction. It overcomes the scalability limits of vanilla in-context learning by using Theorem Precedence Graphs to encode temporal dependencies and impose topological constraints, achieving state-of-the-art accuracy on the FormalGeo7k benchmark without any gradient-based optimization.

Junbo Zhao, Ting Zhang, Can Li, Wei He, Jingdong Wang, Hua Huang

Published 2026-03-06

The Big Problem: The "Lost in the Woods" Effect

Imagine you are trying to solve a very complex geometry puzzle. You have a giant library of 300 different rules (theorems) you can use, like "If two lines are parallel, then angles are equal."

To solve the puzzle, you need to pick the right rule, apply it, get a new fact, pick the next right rule, and so on. This is like walking through a dense forest where every tree represents a possible move.

The Old Way (Vanilla ICL):
Researchers tried to use powerful AI (Large Language Models) to solve this by just giving them a few examples and saying, "Here's how we solved similar problems before; now you do it."

  • The Analogy: It's like giving a tourist a map with no landmarks and saying, "Just guess the path."
  • The Result: For short paths (easy puzzles), the AI does okay. But as the puzzle gets longer and more complex (deeper into the forest), the AI gets confused. It starts picking rules that don't fit, like invoking the Pythagorean Theorem when it needs to prove two angles are equal. The AI gets lost, and its success rate crashes to near zero. The authors call this "Structural Drift": the AI forgets the logical order of the forest.

The New Solution: The "Smart Guidebook" (Pri-TPG)

The authors propose a new method called Pri-TPG. Instead of letting the AI wander blindly, they give it a dynamic, custom-made guidebook for every single puzzle.

Here is how it works, step-by-step:

1. The "Memory Lane" (Retrieval)

When a new geometry problem arrives, the system doesn't just look at the problem; it looks at its "doppelgangers" in a database of past solved puzzles.

  • Analogy: Imagine you are trying to fix a weird leak in your kitchen sink. Instead of guessing, you ask a smart librarian, "Who has fixed a sink like this before?" The librarian pulls out 200 photos of similar sinks that were successfully fixed.
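The retrieval step can be sketched roughly as a nearest-neighbor lookup. The summary doesn't specify the paper's similarity metric, so this sketch makes an assumption: problems are represented as sets of geometric predicates, and past solutions are ranked by Jaccard overlap with the new problem. The predicate strings and database layout are illustrative, not from the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two predicate sets (0 = disjoint, 1 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def retrieve(query: set, database: list, k: int = 200) -> list:
    """Return the k past problems most similar to the query, best first."""
    ranked = sorted(database,
                    key=lambda entry: jaccard(query, entry["predicates"]),
                    reverse=True)
    return ranked[:k]

# Toy database of previously solved problems (illustrative).
solved = [
    {"id": 1, "predicates": {"parallel(AB,CD)", "angle(ABC)"}, "proof": ["A", "B"]},
    {"id": 2, "predicates": {"circle(O)", "tangent(PQ)"},      "proof": ["C"]},
]
neighbors = retrieve({"parallel(AB,CD)", "angle(BCD)"}, solved, k=1)
print(neighbors[0]["id"])  # → 1
```

Any similarity measure (embedding cosine distance, graph matching) could be swapped in; the point is only that the system pulls the closest "doppelgangers" before planning.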

2. The "Flowchart" (Theorem Precedence Graph)

The system takes those 200 past solutions and builds a flowchart (a directed graph). This flowchart shows the order in which rules were used.

  • Analogy: The librarian doesn't just give you a pile of photos. She draws a map showing: "First, you must tighten the valve (Rule A). Only AFTER that can you replace the washer (Rule B). You cannot replace the washer before tightening the valve, or it won't work."
  • This map is called the Theorem Precedence Graph (TPG). It tells the AI: "You are at Step 1. Here are the only 3 moves you are allowed to make next."
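Building that flowchart from the retrieved solutions can be sketched as follows. Here an edge (A, B) means theorem B was applied immediately after A in some past proof, so the successors of the current theorem are the "only moves you are allowed to make next." The paper's actual construction may differ (e.g., it may weight edges by frequency); this is a minimal unweighted version.

```python
from collections import defaultdict

def build_tpg(proofs):
    """Build a directed precedence graph from theorem-name sequences."""
    graph = defaultdict(set)
    for proof in proofs:
        # Record every consecutive pair: earlier theorem -> later theorem.
        for earlier, later in zip(proof, proof[1:]):
            graph[earlier].add(later)
    return graph

# Three toy retrieved proofs, each a sequence of theorem names.
proofs = [["A", "B", "C"], ["A", "C"], ["B", "C", "D"]]
tpg = build_tpg(proofs)
print(sorted(tpg["A"]))  # → ['B', 'C']
print(sorted(tpg["C"]))  # → ['D']
```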

3. The "Bouncer" (Symbolic Executor)

The AI (the planner) picks a move from the allowed list. But before the move is accepted, a strict "Bouncer" (a symbolic solver) checks it.

  • Analogy: The AI says, "I want to use Rule B!" The Bouncer checks the current state of the puzzle. "Nope, you haven't tightened the valve yet. Rule B is blocked. Try Rule A."
  • This prevents the AI from making illegal moves that would break the logic chain.
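The "Bouncer" amounts to a precondition check: a move is accepted only if the facts it depends on have already been established. The rule names and precondition sets below reuse the sink-repair analogy and are purely illustrative; a real symbolic geometry solver checks far richer conditions.

```python
# Each rule lists the facts it needs and the facts it adds (illustrative).
RULES = {
    "tighten_valve":  {"needs": set(),           "adds": {"valve_tight"}},
    "replace_washer": {"needs": {"valve_tight"}, "adds": {"washer_new"}},
}

def try_apply(rule_name, facts):
    """Apply the rule if its preconditions hold; return (ok, new_facts)."""
    rule = RULES[rule_name]
    if not rule["needs"] <= facts:
        return False, facts            # blocked: a precondition is missing
    return True, facts | rule["adds"]  # accepted: state is updated

facts = set()
ok, facts = try_apply("replace_washer", facts)
print(ok)  # → False (valve not tightened yet)
ok, facts = try_apply("tighten_valve", facts)
ok, facts = try_apply("replace_washer", facts)
print(ok)  # → True
```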

4. The "Iterative Loop" (Step-by-Step)

The AI doesn't try to write the whole solution at once. It takes one step, gets checked by the Bouncer, updates its map, and takes the next step.

  • Analogy: It's like playing a game of chess where you make one move, the computer checks if it's legal, updates the board, and then you make the next move. You don't try to predict the whole game in one breath.
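Putting the pieces together, the loop in step 4 can be sketched like this. The real planner is an LLM choosing among the TPG-allowed candidates; here a stand-in planner just takes the first candidate that passes the symbolic check. The graph, rule table, and goal fact are all toy assumptions.

```python
def solve(tpg, rules, start, goal_fact, max_steps=10):
    """Iteratively pick a TPG-allowed, symbolically legal move each turn."""
    facts = set(rules[start]["adds"])
    current, plan = start, [start]
    for _ in range(max_steps):
        if goal_fact in facts:
            return plan
        for candidate in sorted(tpg.get(current, ())):  # TPG-allowed moves
            if rules[candidate]["needs"] <= facts:      # Bouncer's check
                facts |= rules[candidate]["adds"]       # update the board
                plan.append(candidate)
                current = candidate
                break
        else:
            return None  # no legal move: dead end
    return plan if goal_fact in facts else None

# Toy precedence graph and rule table (illustrative).
tpg = {"A": {"B", "C"}, "B": {"C"}}
rules = {
    "A": {"needs": set(),   "adds": {"f1"}},
    "B": {"needs": {"f1"},  "adds": {"f2"}},
    "C": {"needs": {"f2"},  "adds": {"goal"}},
}
print(solve(tpg, rules, "A", "goal"))  # → ['A', 'B', 'C']
```

Note how "C" is a TPG-allowed successor of "A" but is rejected on the first turn because its precondition "f2" doesn't hold yet; the loop only reaches it after "B" establishes that fact.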

Why This is a Big Deal

  1. No "Schooling" Required: Most AI models need to be "trained" (studied for months) on specific math problems to get good at them. If the math library changes, you have to re-train them.

    • Pri-TPG is "Training-Free." It learns on the fly by looking at past examples. It's like a genius student who can solve a new type of math problem just by looking at a few similar examples from a textbook, without needing to go to summer school.
  2. Solving the "Long Chain" Problem: The biggest breakthrough is that this method works even for very long, difficult puzzles (6+ steps).

    • The Result: On a standard benchmark, the old "guessing" AI got about 26% of the hard problems right. The new Pri-TPG method got 89% right. It matched the performance of the most expensive, heavily trained super-AIs, but without the cost of training.

Summary in One Sentence

The paper teaches AI how to solve complex, multi-step logic puzzles by giving it a custom-made, step-by-step flowchart based on past solutions, acting as a strict guide that prevents the AI from getting lost in the forest of possibilities.