RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction

RetroReasoner is a reasoning-based large language model for retrosynthesis prediction. It combines supervised fine-tuning on structured disconnection rationales with reinforcement learning that uses round-trip accuracy as the reward, and it outperforms existing methods by explicitly modeling strategic bond-disconnection thinking.

Hanbum Ko, Chanhui Lee, Ye Rin Kim, Rodrigo Hormazabal, Sehui Han, Sungbin Lim, Sungwoong Kim

Published 2026-03-16

Imagine you are a master chef trying to figure out exactly how a famous, complex dish (like a multi-layered chocolate cake with gold leaf) was made. You have the finished cake in front of you, but the recipe is lost. Your goal is to work backward to discover the raw ingredients and the specific steps the original chef took to create it.

In the world of chemistry, this is called Retrosynthesis. It's the art of taking a finished molecule and figuring out what simpler chemicals were mixed together to build it.

For a long time, computers trying to solve this puzzle were like students who just memorized the answer key. They could guess the ingredients, but they didn't understand why those ingredients were chosen. They lacked the "strategic thinking" of a real chemist.

Enter RetroReasoner, a new AI model that doesn't just guess; it thinks like a chemist. Here is how it works, broken down into simple concepts:

1. The Problem: The "Black Box" Guessers

Previous AI models were like a magic 8-ball. You asked, "What are the ingredients?" and it gave an answer. Sometimes it was right, but often the answer was just a lucky or generic guess based on patterns it had seen before. It couldn't explain its logic, and if the recipe was unusual (like a dish with a rare spice), it would often fail completely.

2. The Solution: Teaching the AI to "Think Aloud"

The researchers behind RetroReasoner realized that to solve this, the AI needs to follow a specific mental checklist, just like a human chemist does. To teach it that checklist, they created a framework called SyntheticRetro that generates the training examples.

Think of SyntheticRetro as a "ghostwriter" for the AI. It takes millions of real chemical recipes and rewrites them into a step-by-step story. Instead of just saying "Mix A and B," it teaches the AI to say:

  • Step 1 (Product Analysis): "Look at this cake. It has a chocolate layer and a strawberry layer."
  • Step 2 (Finding the Weak Spot): "I see a seam where the chocolate meets the strawberry. That's the easiest place to pull them apart."
  • Step 3 (The Cut): "If I cut here, I get two separate pieces: a chocolate block and a strawberry block."
  • Step 4 (The Ingredients): "The chocolate block was likely made from cocoa and sugar. The strawberry block came from fresh strawberries and gelatin."

RetroReasoner learns this "thinking aloud" process. It doesn't just output the answer; it outputs the reasoning that leads to the answer.
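To make the four-step "story" concrete, here is a minimal sketch of how one reaction could be written out as a step-by-step training target. The class name, field names, step labels, and text layout are all hypothetical and chosen for illustration; the actual format produced by SyntheticRetro is not described here. The example reaction (acetyl chloride plus methylamine giving N-methylacetamide) is a standard textbook amide formation, written in SMILES notation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rationale:
    """One reaction rewritten as a four-step reasoning trace (hypothetical schema)."""
    product_analysis: str   # Step 1: what the target molecule contains
    disconnection: str      # Step 2: which bond looks easiest to break
    fragments: List[str]    # Step 3: the pieces left after the cut
    reactants: List[str]    # Step 4: real starting materials, as SMILES


def render_target(r: Rationale) -> str:
    """Turn a rationale into the text the model would be trained to write."""
    steps = [
        ("Product analysis", r.product_analysis),
        ("Disconnection site", r.disconnection),
        ("Fragments", " + ".join(r.fragments)),
        ("Reactants", ".".join(r.reactants)),  # '.' separates molecules in SMILES
    ]
    return "\n".join(
        f"Step {i} ({name}): {text}" for i, (name, text) in enumerate(steps, start=1)
    )


# Illustrative example: target is N-methylacetamide, CC(=O)NC.
example = Rationale(
    product_analysis="The target CC(=O)NC contains an amide group.",
    disconnection="Break the C-N bond of the amide.",
    fragments=["acyl fragment", "amine fragment"],
    reactants=["CC(=O)Cl", "CN"],
)

print(render_target(example))
```

The point of the layout is that the final answer (the reactants) comes last, after the analysis that justifies it, so the model learns to produce the reasoning before the answer.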

3. The Training: The "Taste Test" (Round-Trip Accuracy)

How do you know if the AI's guess is actually good? In chemistry, you can't just check if the answer matches a textbook list, because there are often many different ways to make the same cake.

The researchers used a clever trick called Round-Trip Accuracy.

  • The Outbound Trip: The AI works backward from the finished dish and guesses the ingredients (e.g., "Flour and Eggs").
  • The Return Trip: They feed those guessed ingredients into a different AI that acts like a forward-cooking simulator. It tries to "cook" the ingredients to see what dish it produces.
  • The Reward: If the simulated dish turns out to be the same cake you started with, the AI gets a high score (a reward). If it makes a mess or a different cake, it gets a low score.

This is like a game of "Telephone" where the message must come back to you perfectly. It pushes the AI toward ingredients that the simulator agrees would actually produce the target, which serves as a stand-in for being workable in a real lab, rather than ingredients that merely look plausible.
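The round-trip check described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the forward model is a toy lookup table standing in for a learned forward-prediction model, the reward is assumed to be a simple 1.0 or 0.0, and `normalize` only sorts the molecules in a SMILES string instead of doing real chemical canonicalization.

```python
from typing import Callable, Dict, Optional


def normalize(smiles: str) -> str:
    """Toy normalization: trim whitespace and sort the '.'-separated molecules.

    A real pipeline would canonicalize SMILES with a cheminformatics toolkit.
    """
    parts = [p.strip() for p in smiles.split(".") if p.strip()]
    return ".".join(sorted(parts))


def round_trip_reward(
    product: str,
    predicted_reactants: str,
    forward_model: Callable[[str], Optional[str]],
) -> float:
    """Return 1.0 if 'cooking' the predicted reactants gives back the product."""
    simulated = forward_model(normalize(predicted_reactants))
    if simulated is None:  # the simulator could not produce anything
        return 0.0
    return 1.0 if normalize(simulated) == normalize(product) else 0.0


# Toy stand-in for the forward-cooking simulator (assumption: one known reaction).
TOY_REACTIONS: Dict[str, str] = {
    normalize("CC(=O)Cl.CN"): "CC(=O)NC",  # acetyl chloride + methylamine
}


def toy_forward_model(reactants: str) -> Optional[str]:
    return TOY_REACTIONS.get(reactants)


target = "CC(=O)NC"
print(round_trip_reward(target, "CN.CC(=O)Cl", toy_forward_model))  # matching round trip
print(round_trip_reward(target, "CCO.CN", toy_forward_model))       # simulator has no result
```

Note that the reward never compares the guess to a reference list of ingredients. Any set of reactants that the simulator turns back into the target earns the reward, which is what allows several different valid recipes for the same molecule to score well.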

4. The Results: Why It Matters

When tested, RetroReasoner was like a seasoned master chef compared to the previous "guessing" models.

  • It handles the weird stuff: When given molecules with rare, strange features (rare atoms or complex reactions), RetroReasoner held up better than the earlier models. Because it reasons about where to cut bonds instead of recalling familiar patterns, it could still propose sensible ingredients for unusual molecules.
  • It offers more options: Instead of giving one single answer, it could suggest several different valid ways to make the molecule, giving human chemists more choices.
  • It's explainable: Because it writes out its reasoning steps, a human chemist can look at its work, say, "Ah, I see why it chose that cut," and trust the result.

The Big Picture

RetroReasoner is a bridge between raw data and human intuition. It teaches AI to stop memorizing answers and start understanding the logic of chemistry. By mimicking the strategic thinking of human experts and using a "cooking simulation" to verify its work, it promises to speed up the discovery of new medicines, materials, and chemicals, turning the complex puzzle of molecular building into a solvable game.
