Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression

This paper introduces a framework that leverages large language models to guide symbolic regression, successfully discovering accurate, interpretable, and simplified physical laws for perovskite materials while drastically reducing the search space compared to traditional methods.

Original authors: Yifeng Guan, Chuyi Liu, Dongzhan Zhou, Lei Bai, Wan-jian Yin, Jingyuan Li, Mao Su

Published 2026-02-27

This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you are trying to figure out the secret recipe for the perfect chocolate cake. You have a huge list of ingredients: flour, sugar, eggs, salt, vanilla, cocoa, baking powder, and maybe even some weird stuff like glitter or motor oil.

The Problem: The "Blind Walk"
Traditional Symbolic Regression (a method that searches for a math formula that fits the data) tries to find the recipe by mixing and matching every single ingredient in every possible combination. It might try "Motor Oil + Sugar" or "Glitter + Eggs."

  • The Issue: This is like walking through a massive library blindfolded, looking for one specific book. It takes forever, and you might accidentally write down a recipe that tastes okay but makes no sense physically (like "add 5 gallons of motor oil"). The math works, but the physics is nonsense.

The New Solution: The "Smart Librarian"
This paper introduces a new tool called LangLaw. Think of LangLaw as a Super-Intelligent Librarian (a Large Language Model, or AI) who has read every science textbook ever written.

Instead of letting the blindfolded walker search the whole library, the Librarian steps in first.

  1. The Librarian's Job: Before the search begins, the Librarian looks at the ingredients and says, "Hey, we don't need motor oil or glitter for a cake. Let's ignore those. Also, we know that flour and sugar are the main players, so let's focus on those."
  2. Guiding the Search: The Librarian gives the blindfolded walker a tiny, focused map of just the "Flour and Sugar" section of the library.
  3. The Result: The walker finds the perfect recipe much faster. The recipe isn't just accurate; it makes sense. It tells you why the cake rises (because of the baking powder), not just that it does.
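
The three steps above can be sketched as a toy program. This is a minimal illustration, not the LangLaw implementation: the data are synthetic, the hidden "law" (y = x0 * x1) is made up, and `llm_suggested` is a hard-coded, hypothetical stand-in for the advice a language model would give.

```python
import itertools
import random

# Synthetic data (an assumption for illustration): the hidden "law" is
# y = x0 * x1, while x2 and x3 are distractors (the "glitter" and "motor oil").
random.seed(0)
rows = [[random.uniform(1, 5) for _ in range(4)] for _ in range(30)]
target = [r[0] * r[1] for r in rows]

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def search(features):
    """Brute-force every 'xi op xj' formula over the given feature indices."""
    best, tried = None, 0
    for i, j in itertools.product(features, repeat=2):
        for name, op in OPS.items():
            tried += 1
            # Sum of squared errors between the candidate formula and the data.
            err = sum((op(r[i], r[j]) - y) ** 2 for r, y in zip(rows, target))
            if best is None or err < best[0]:
                best = (err, f"x{i} {name} x{j}")
    return best[1], tried

# Hypothetical stand-in for the Librarian: a real system would ask a
# language model which ingredients are physically plausible.
llm_suggested = [0, 1]

blind_formula, blind_tried = search([0, 1, 2, 3])
guided_formula, guided_tried = search(llm_suggested)
print(blind_formula, blind_tried)
print(guided_formula, guided_tried)
```

Both searches land on the same formula here, but the guided one tests a quarter as many candidates. With more ingredients and longer formulas, the gap grows much faster.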

How It Works in the Real World (Materials Science)

The researchers tested this "Smart Librarian" on three tricky problems in materials science, centered on perovskites (a family of crystalline materials):

  1. How hard is the material to squeeze? (Bulk Modulus)

    • Old Way: Tried thousands of random math formulas. Some were accurate but looked like gibberish.
    • LangLaw Way: The AI knew that "how much an atom wants to steal an electron" matters. It guided the math to find a simple, clean formula that explains why some materials squash easily and others resist.
  2. Which light can the material absorb? (Band Gap)

    • Old Way: Created a super-complex equation with 10 different parts that was hard to understand.
    • LangLaw Way: Found a much shorter, simpler equation that did the same job. It's like finding a shortcut that saves you 90% of the walking time.
  3. How good is the material at helping split water to make fuel? (OER Activity, short for the oxygen evolution reaction)

    • Old Way: Needed a massive amount of data to learn, and often failed when given new, rare materials.
    • LangLaw Way: Even with very little data (like having only 18 cake recipes to learn from), the AI used its "common sense" (scientific knowledge) to predict how new materials would behave. It was twice as good at guessing new materials as the best deep-learning models.

Why This Matters

  • Speed: It reduced the search space by a factor of 100,000. Imagine searching for a needle in a haystack, but the AI tells you, "The needle is actually in this tiny box right here."
  • Understanding: It doesn't just give you a number; it gives you a story. It explains the physical rules behind the material, not just the result.
  • Small Data: It works even when we don't have millions of data points (which is common in expensive science experiments).
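
To see why ignoring a few ingredients shrinks the haystack so dramatically, here is a rough counting sketch. The numbers (12 candidate variables cut to 3, four operators, formulas with three operations) are invented for illustration and are not taken from the paper, and `count_formulas` is a hypothetical helper that counts only simple binary formulas.

```python
from math import comb

def count_formulas(n_vars, n_ops, n_operators):
    """Count binary expression trees with n_operators operator nodes.

    Each tree shape has n_operators slots for an operator and
    n_operators + 1 slots (leaves) for a variable.
    """
    k = n_operators
    shapes = comb(2 * k, k) // (k + 1)  # Catalan number: distinct tree shapes
    return shapes * n_ops ** k * n_vars ** (k + 1)

# Illustrative sizes only (assumptions, not figures from the paper).
full = count_formulas(n_vars=12, n_ops=4, n_operators=3)
pruned = count_formulas(n_vars=3, n_ops=4, n_operators=3)
print(full, pruned, full // pruned)
```

In this toy count, cutting the ingredient list by a factor of 4 cuts the number of candidate formulas by a factor of 4 to the 4th power, because every extra slot in the formula multiplies the savings.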

In a Nutshell:
LangLaw is like giving a brilliant, knowledgeable professor (the AI) a team of hardworking students (the math algorithms). The professor tells the students what to look for and what to ignore, so they don't waste time on nonsense. The result is a discovery that is not only correct but also easy for humans to understand and use to build better materials.
