⚛️ high-energy theory

Learning the S-matrix from data: Rediscovering gravity from gauge theory via symbolic regression

This paper demonstrates that symbolic regression applied to numerical on-shell data can autonomously rediscover fundamental analytic structures in scattering amplitudes, including KLT, Kleiss-Kuijf, and BCJ relations, thereby establishing a data-driven strategy for uncovering hidden theoretical connections like the gravity-gauge duality without relying on prior group-theoretic knowledge.

Original authors: Nathan Moynihan

Published 2026-02-18

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you are a detective trying to solve a massive, cosmic mystery. The "crime scene" is the subatomic world, where particles crash into each other and bounce off like billiard balls. Physicists describe these collisions with scattering amplitudes: mathematical recipes that encode how likely each possible outcome of a collision is.

For decades, physicists have been writing these recipes by hand, using incredibly difficult math. But recently, they started asking: What if we just let a computer look at the data and figure out the recipe for us?

This paper describes an attempt to do exactly that. It uses a type of machine learning called Symbolic Regression to rediscover some of the most famous known relations governing how particles interact, specifically how Gravity (the force that holds planets in orbit) is secretly related to Gauge Theory (the framework behind forces such as the strong force, which binds quarks inside protons and neutrons).

Here is the story of how they did it, explained with simple analogies.

1. The Problem: The "Black Box" vs. The "Recipe Book"

Most modern AI (like the chatbots you use) is a Black Box. You give it a question, and it gives you an answer. It's great at guessing, but it doesn't tell you how it got there. It's like a chef who makes a delicious cake but refuses to tell you the ingredients or the steps. In physics, we don't just want the answer; we want the recipe (the mathematical formula) so we can understand the underlying laws of nature.

Symbolic Regression is different. Instead of just guessing numbers, it tries to find the actual equation. It's like a detective who doesn't just say "the butler did it," but actually writes out the step-by-step logic of how the butler did it, using only the clues found at the scene.

2. The Setup: The "Lego" of the Universe

The paper focuses on two types of particles:

  • Gluons: The "glue" particles that bind quarks together inside protons and neutrons (Gauge Theory).
  • Gravitons: The hypothetical particles that carry gravity.

There is a famous, beautiful relationship between them called the KLT Relation. It's like a secret code that says: "If you take two sets of Gluon recipes and mix them together with a specific spice (called a Mandelstam invariant), you get the Gravity recipe."

The goal of the paper was to see if a computer could look at thousands of numbers representing Gluon collisions and figure out this secret code on its own, without the physicists telling it the code exists.
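
For readers who want to see the "secret code" itself, the simplest case (four particles) is commonly written as below. This is the standard textbook form rather than a formula quoted from the paper; the notation (M_4 for the graviton amplitude, A_4 for color-ordered gluon amplitudes, s_{12} for a Mandelstam invariant) and the overall factor of -i are convention-dependent assumptions, and coupling constants are stripped off.

```latex
% Four-point tree-level KLT relation (one common convention)
M_4^{\mathrm{tree}}(1,2,3,4)
  = -\,i\, s_{12}\,
    A_4^{\mathrm{tree}}(1,2,3,4)\,
    A_4^{\mathrm{tree}}(1,2,4,3),
\qquad
s_{12} = (p_1 + p_2)^2 .
```

The two gluon amplitudes differ only in the ordering of particles 3 and 4, and the single "spice" is the invariant s_{12}.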

3. The Method: The "Data Shrinker"

The computer was fed a massive amount of data:

  • The Ingredients: Lists of numbers called Mandelstam invariants, which are built from the energies and momenta of the colliding particles.
  • The Output: The results of Gluon collisions.
  • The Target: The results of Gravity collisions.

The Challenge: The data was messy. It was like trying to find a specific sentence in a library where every book has been shredded and mixed together. There were too many redundant numbers (features).

The Solution (CPQR): The paper uses a mathematical tool called Column-Pivoted QR factorization.

  • Analogy: Imagine you have a giant pile of ingredients for a soup. Some are just watered-down versions of others (e.g., "salt" and "salty water"). The computer acts like a smart sous-chef who looks at the pile and says, "We don't need all these; we just need these 5 specific spices to make the soup."
  • By removing the redundant data, the computer automatically rediscovered two famous sets of mathematical rules (the KK and BCJ relations) that physicists already knew. It found them just by looking for patterns in the numbers, which suggests the method can recover genuine physical structure without being told about it.
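
The sous-chef idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the feature matrix is random synthetic data with two deliberately redundant columns (standing in for amplitudes tied together by linear relations), and the tolerance `1e-10` is an arbitrary assumption. The only library call that matters is `scipy.linalg.qr` with `pivoting=True`.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)

# Synthetic stand-in data: rows are sample points, columns are candidate features.
# Columns 0-2 are independent; columns 3 and 4 are exact linear combinations of them.
independent = rng.normal(size=(200, 3))
redundant = np.column_stack([
    independent[:, 0] + independent[:, 1],
    2.0 * independent[:, 2] - independent[:, 0],
])
features = np.hstack([independent, redundant])

# Column-pivoted QR: features[:, pivots] == Q @ R, with |diag(R)| non-increasing.
_, r, pivots = qr(features, mode="economic", pivoting=True)

# Numerical rank = number of diagonal entries of R that are not negligible.
diag = np.abs(np.diag(r))
rank = int(np.sum(diag > 1e-10 * diag[0]))

kept = sorted(pivots[:rank].tolist())     # columns worth keeping
dropped = sorted(pivots[rank:].tolist())  # columns that are redundant

# Each dropped column is a linear combination of the kept ones;
# the coefficients are the "relations" hidden in the data.
coeffs, *_ = np.linalg.lstsq(features[:, kept], features[:, dropped], rcond=None)
residual = np.abs(features[:, kept] @ coeffs - features[:, dropped]).max()

print("rank:", rank, "kept:", kept, "dropped:", dropped)
```

In this toy setting the coefficients in `coeffs` play the role that the KK and BCJ relations play in the real problem: they express the redundant quantities in terms of a smaller independent set.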

4. The Discovery: Re-inventing Gravity

Once the data was cleaned up, the Symbolic Regression engine went to work. It started mixing and matching the remaining ingredients (Gluon results and energy numbers) to see what combination produced the Gravity result.
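
To make "mixing and matching" concrete, here is a deliberately tiny sketch of the idea. Everything in it is an assumption made for illustration: the data are random numbers rather than real amplitudes, the hidden relation is planted by hand with the same shape as the four-particle KLT formula (constants and sign conventions dropped), and the search is a brute-force scan over simple products rather than a real symbolic-regression engine, which explores a far larger space of expressions.

```python
import itertools
import random

random.seed(0)

# Synthetic stand-in data (NOT real amplitudes): two Mandelstam-like numbers s, t
# and two gluon-amplitude-like numbers a1, a2 per sample.
names = ("s", "t", "a1", "a2")
samples = [tuple(random.uniform(0.5, 2.0) for _ in names) for _ in range(50)]

# Planted "gravity" target with the shape of the four-particle KLT relation.
targets = [s * a1 * a2 for s, t, a1, a2 in samples]

def monomial(point, exponents):
    """Evaluate the product of x**power over all features of one sample."""
    value = 1.0
    for x, power in zip(point, exponents):
        value *= x ** power
    return value

def worst_relative_error(exponents):
    """Largest relative mismatch between a candidate formula and the target."""
    return max(abs(monomial(p, exponents) - y) / abs(y)
               for p, y in zip(samples, targets))

# Exhaustive search over all monomials with exponents in {-1, 0, 1, 2}.
candidates = itertools.product((-1, 0, 1, 2), repeat=len(names))
best = min(candidates, key=worst_relative_error)

formula = " * ".join(f"{n}^{e}" for n, e in zip(names, best) if e != 0)
print("best formula:", formula)
```

Even this toy scans 4^4 = 256 candidate formulas for just four ingredients, which hints at why the search becomes hard as the number of ingredients grows.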

  • At 4 particles: It was easy. The computer found the formula almost instantly. It rediscovered the KLT relation, effectively saying, "Aha! Gravity is just Gluons multiplied by each other and some energy numbers!"
  • At 5 particles: It got harder, but the computer still found the answer. It took a bit longer, but it successfully wrote down the complex formula that connects the two forces.
  • At 6 particles: The computer hit a wall. The number of possible combinations exploded. It's like trying to solve a Rubik's cube that keeps getting bigger every time you turn a side. The computer got overwhelmed by the sheer number of possibilities (a "combinatorial explosion") and couldn't find the simple answer in the time allowed.
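
A rough feel for this explosion comes from counting the gluon "recipes" available at each particle number. The counts below are standard results for color-ordered tree amplitudes, not figures taken from the paper, and the last entry (the number of ways to pair one basis amplitude from each of the two gluon copies) is only a hypothetical proxy for the size of the search space.

```python
from math import factorial

def amplitude_counts(n):
    """Standard counts of color-ordered n-gluon tree amplitudes."""
    return {
        "orderings": factorial(n - 1),         # orderings distinct up to cyclic rotation
        "after_KK": factorial(n - 2),          # independent after Kleiss-Kuijf relations
        "after_BCJ": factorial(n - 3),         # independent after BCJ relations
        "basis_pairs": factorial(n - 3) ** 2,  # products of one basis amplitude per copy
    }

table = {n: amplitude_counts(n) for n in (4, 5, 6)}
for n, counts in table.items():
    print(n, counts)
```

Each pair can additionally be multiplied by products of Mandelstam invariants, whose number also grows with the particle count, so the space of candidate formulas grows much faster than these numbers alone suggest.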

5. The Comparison: The "Translator" vs. The "Detective"

The paper also compares its method (Symbolic Regression) to a newer method using Neural Networks (Deep Learning).

  • The Neural Network (The Translator): Imagine you have a long, complicated sentence in a foreign language. A Neural Network is like a translator that has read millions of books. It can look at the long sentence and instantly spit out a short, simple version. It's great at simplifying things it has seen before, but it might "hallucinate" (make up a sentence that looks right but is wrong).
  • Symbolic Regression (The Detective): This method doesn't know the answer beforehand. It looks at the raw data points (the "clues") and builds the formula from scratch. It's slower and needs help to know which clues are important, but the result is an explicit equation that anyone can check. If the formula also holds on fresh data it was never fitted to, that is strong evidence that it is correct.

The Big Takeaway

This paper is a proof of concept. It shows that AI can be a partner in discovery, not just a calculator.

  • What worked: At four and five particles, the AI successfully "re-discovered" the deep connection between Gravity and Gauge Theory using only raw numbers, without being told the rules.
  • What's next: The AI is currently stuck on the "6-particle" problem because the number of candidate formulas grows too quickly. The paper suggests a hybrid future: use the Neural Network to simplify the messy expressions (like a translator simplifying a text), and then use Symbolic Regression to find the final, checkable formula (like a detective solving the case).

In short, the paper shows that a computer can look at the chaos of particle collisions and whisper back the elegant, hidden laws of the universe. It's a step toward a future where we don't just calculate the universe, but let the universe teach us its own secrets.