LeanTutor: Towards a Verified AI Mathematical Proof Tutor

This paper presents LeanTutor, a proof-of-concept AI system that combines Large Language Models with the Lean theorem prover to provide verified mathematical proof tutoring, evaluated using the newly introduced PeanoBench dataset.

Manooshree Patel, Rayna Bhattacharyya, Thomas Lu, Arnav Mehta, Niels Voss, Narges Norouzi, Gireeja Ranade

Published 2026-03-05

Imagine you are trying to learn how to write a perfect, unbreakable legal contract. You have two potential teachers, but each has a major flaw:

  1. The Chatty Friend (The LLM): This teacher is incredibly friendly, speaks your language fluently, and can explain complex ideas with great stories. However, they are a bit of a daydreamer. They often make up facts, get the details wrong, and their contracts are full of hidden loopholes. They always sound confident, but they are not always right.
  2. The Robot Judge (The Theorem Prover): This teacher is a machine that never lies. If they say a contract is valid, it is 100% mathematically guaranteed to be correct. But here's the catch: they only speak in a secret, robotic code that is incredibly hard for humans to understand. If you ask them a question in plain English, they just stare at you blankly.

The Problem: Students need the friendliness of the Chatty Friend to learn, but they need the absolute accuracy of the Robot Judge to be sure that what they learn is correct.

The Solution: LeanTutor
The paper introduces LeanTutor, a new system that acts like a super-powered translator and coach, combining the best of both worlds. Think of it as a bilingual tour guide who speaks both "Human" and "Robot."

LeanTutor works like a three-person dream team:

  • The Translator (Autoformalizer/Proof-Checker): When you write a math idea in plain English, this module instantly translates it into the Robot Judge's secret code. It then asks the Robot, "Is this actually true?" If the Robot says "No," the Translator tells you exactly where your logic broke.
  • The Coach (Next-Step Generator): Instead of just saying "Wrong," this module acts like a helpful hiking guide. If you are stuck on a mountain path, it doesn't carry you up; it points to the next safe step you should take to keep climbing. It suggests the next logical move in the proof.
  • The Storyteller (Natural Language Feedback Generator): Once the Robot Judge has checked the math, this module takes the cold, robotic error message and turns it into a warm, encouraging explanation in plain English. It tells you why you made a mistake and how to fix it, just like a human tutor would.
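
To make the "secret code" less mysterious, here is a minimal sketch in Lean 4 of what a translated student proof might look like. Everything in it is an assumption made for illustration: the type `MyNat`, the function `add`, and the theorem name `zero_add` are hypothetical and are not taken from LeanTutor. The comments play the role of the student's plain-English steps, and each Lean line beneath them is the kind of formal step the Robot Judge can check.

```lean
-- Hypothetical, self-contained Peano-style naturals for illustration only;
-- not taken from LeanTutor or PeanoBench.
inductive MyNat where
  | zero : MyNat
  | succ : MyNat → MyNat

namespace MyNat

-- Addition defined by recursion on the second argument.
def add : MyNat → MyNat → MyNat
  | n, zero   => n
  | n, succ m => succ (add n m)

-- Student (plain English): "0 + n = n, by induction on n."
theorem zero_add (n : MyNat) : add zero n = n := by
  induction n with
  | zero =>
    -- Student: "Base case: 0 + 0 = 0 by the definition of addition."
    rfl
  | succ m ih =>
    -- Student: "Inductive step: 0 + succ m is succ (0 + m) by definition..."
    show succ (add zero m) = succ m
    -- Student: "...and the induction hypothesis says 0 + m = m."
    rw [ih]

-- A flawed student step: "0 + succ m = m". Uncommenting the next line makes
-- Lean report an error, because `rfl` cannot prove this false statement.
-- example (m : MyNat) : add zero (succ m) = m := by rfl

end MyNat
```

The commented-out line at the end is the interesting part for a tutor. Lean pinpoints the exact step that fails, and a module like the Storyteller could then turn that failure into a plain-English hint.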

The Test Drive: PeanoBench
To see if this team actually works, the authors built a testing ground called PeanoBench. Imagine a gym with 371 specific math puzzles, all based on the Peano axioms (the basic rules of the counting numbers). Every puzzle comes in two versions: a "human story" version in plain English and a "robot code" version in Lean. This lets the authors test whether LeanTutor can guide a student from the story to the correct robot code without making mistakes.
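
As a rough picture of what one paired puzzle could look like, the sketch below puts a "human story" next to its "robot code" in Lean 4. The layout, the wording of the story, and the names `MyNat`, `add`, and `succ_add` are all hypothetical; the real PeanoBench entries may be organized quite differently.

```lean
-- Hypothetical Peano-style setup, repeated here so the example stands alone.
inductive MyNat where
  | zero : MyNat
  | succ : MyNat → MyNat

namespace MyNat

def add : MyNat → MyNat → MyNat
  | n, zero   => n
  | n, succ m => succ (add n m)

/-
Human story (illustrative wording, not a real PeanoBench entry):
  "We show succ a + b = succ (a + b) by induction on b.
   Base case: both sides reduce to succ a when b = 0.
   Inductive step: unfold the definition of addition on both sides,
   then apply the induction hypothesis."
-/

-- Robot code: the same argument, one checked step per sentence.
theorem succ_add (a b : MyNat) : add (succ a) b = succ (add a b) := by
  induction b with
  | zero => rfl                                       -- base case
  | succ k ih =>
    show succ (add (succ a) k) = succ (succ (add a k)) -- unfold addition
    rw [ih]                                           -- induction hypothesis

end MyNat
```

Pairing the two versions step by step is what would make it possible to check a translation: if the Lean step produced from a sentence does not do the same job as the reference step, the mismatch shows up immediately.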

In a Nutshell
LeanTutor is an AI tutor that uses the brain of a human-friendly chatbot to talk to you, but uses the soul of a strict math robot to check your work. It ensures that when you learn math, you are learning the truth, not just a convincing story.
