Symbolic Discovery of Stochastic Differential Equations with Genetic Programming

This paper introduces a genetic programming-based method for the symbolic discovery of stochastic differential equations that jointly optimizes drift and diffusion functions via maximum likelihood estimation, enabling the accurate, scalable, and interpretable modeling of noisy dynamical systems.

Sigur de Vries, Sander W. Keemink, Marcel A. J. van Gerven

Published Wed, 11 Ma

Imagine you are a detective trying to figure out how a complex machine works, but you can only see the machine's output on a screen, and the screen is covered in static (noise).

Most scientists try to ignore the static, assuming the machine follows a perfect, predictable path. They try to draw a single, smooth line to explain the movement. But in the real world, things are messy. A stock market doesn't just go up or down; it jitters. A neuron in the brain doesn't just fire; it sparks randomly.

This paper introduces a new detective tool called GP-SDE (Genetic Programming for Stochastic Differential Equations). Here is how it works, explained simply:

1. The Problem: The "Noisy" Machine

Imagine you are watching a drunk person walking home.

  • The Drunk's Intent (Drift): They want to walk straight to their front door. This is the predictable part.
  • The Stumbles (Diffusion/Noise): But they are also tripping over cracks in the sidewalk, swaying in the wind, and bumping into people. This is the random, chaotic part.

Old methods tried to guess the path by ignoring the stumbles or treating them as a mistake. This paper says: "No! The stumbles are part of the story. We need to write a rule for how they stumble, not just where they are going."
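In equation form, the drunk walk is a stochastic differential equation dX = f(X) dt + g(X) dW, where f is the drift (the intent) and g is the diffusion (the stumbles). A minimal sketch of simulating one using the standard Euler-Maruyama scheme — the toy drift and diffusion rules here are made up for illustration, not taken from the paper:

```python
import math
import random

def simulate_sde(x0, drift, diffusion, dt=0.01, steps=1000, seed=0):
    """Euler-Maruyama simulation of dX = drift(X) dt + diffusion(X) dW."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path

# Toy walk: the drift pulls toward the "front door" at x = 0, while the
# stumbles (noise) grow with distance from the door.
path = simulate_sde(x0=5.0,
                    drift=lambda x: -0.5 * x,
                    diffusion=lambda x: 0.2 * abs(x) + 0.05)
```

Note that the noise term is part of the rule, not an error bar bolted on afterward: the diffusion function here depends on the state, just like the stumbling depends on where the walker is.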

2. The Tool: Genetic Programming (The "Evolutionary Chef")

The authors use a method called Genetic Programming. Think of this as a cooking competition where the chefs are computer programs.

  • The Ingredients: The computer has a library of math ingredients (plus, minus, multiply, divide, sine, cosine, etc.).
  • The Recipe: It randomly mixes these ingredients to create thousands of different "recipes" (mathematical equations) to describe the drunk person's walk.
  • The Taste Test (Fitness): It tests these recipes against the real video footage.
    • If a recipe predicts the path perfectly, it gets a high score.
    • If it fails, it gets a low score.
  • Evolution: The best recipes "mate" (swap parts of their code) and "mutate" (change a random ingredient) to create even better recipes for the next round. Over time, the computer evolves a perfect recipe that explains both the walking and the stumbling.
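The cooking competition above can be sketched in a few dozen lines. This is a deliberately crude toy — real genetic programming (and the paper's method) uses subtree crossover and subtree mutation on expression trees, whereas this sketch just replaces losers with fresh random trees; all names are mine:

```python
import operator
import random

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def random_expr(rng, depth=2):
    """A random 'recipe': a leaf ('x' or a constant) or an operator node."""
    if depth == 0 or rng.random() < 0.3:
        return 'x' if rng.random() < 0.5 else round(rng.uniform(-2, 2), 2)
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if expr == 'x':
        return x
    if isinstance(expr, tuple):
        op, a, b = expr
        return OPS[op](evaluate(a, x), evaluate(b, x))
    return expr  # a numeric constant

def fitness(expr, data):
    """The taste test: squared error against observations (lower is better)."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data)

def mutate(expr, rng):
    # Crude stand-in: grow a fresh random tree. Real GP swaps subtrees
    # between parents (crossover) and perturbs single nodes (mutation).
    return random_expr(rng)

def evolve(data, pop_size=50, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_expr(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda e: fitness(e, data))
        survivors = pop[: pop_size // 2]          # keep the best recipes
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        pop = survivors + children                # next round
    return min(pop, key=lambda e: fitness(e, data))

# Target "recipe": y = 2 * x. The evolved expression should fit it closely.
data = [(x, 2.0 * x) for x in range(-3, 4)]
best = evolve(data)
```

The evolved `best` is a readable expression tree, not a black-box weight matrix — that is what makes the result interpretable.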

3. The Big Breakthrough: Cooking Two Dishes at Once

Usually, these computer chefs only try to write a recipe for the walking (the drift). They assume the stumbling is just random error they can't explain.

This paper's innovation is teaching the chef to cook two dishes simultaneously:

  1. The Drift Dish: The rule for where the system wants to go.
  2. The Diffusion Dish: The rule for how it stumbles and sways.

By learning both at the same time, the computer gets a much clearer picture of reality. It's like realizing that the drunk person isn't just "bad at walking," but that their stumbling follows a specific pattern based on how fast they are moving or how tired they are.
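The "taste test" that scores both dishes at once is a likelihood, as the paper's maximum likelihood framing suggests: under a small-step (Euler) discretization, each observed step is approximately Gaussian, with its mean set by the drift and its variance set by the diffusion. A minimal sketch under that assumption — the toy mean-reverting system and all names are mine, not the paper's implementation:

```python
import math
import random

def simulate(x0, drift, diffusion, dt, steps, seed=1):
    """Generate synthetic data from a known SDE (Euler-Maruyama)."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for _ in range(steps):
        x += drift(x) * dt + diffusion(x) * rng.gauss(0.0, math.sqrt(dt))
        path.append(x)
    return path

def neg_log_likelihood(path, dt, drift, diffusion):
    """Score a (drift, diffusion) pair jointly:
    x_next ~ Normal(x + drift(x) * dt,  diffusion(x)**2 * dt)."""
    nll = 0.0
    for x, x_next in zip(path, path[1:]):
        mean = x + drift(x) * dt
        var = diffusion(x) ** 2 * dt
        nll += 0.5 * (math.log(2 * math.pi * var) + (x_next - mean) ** 2 / var)
    return nll

# Data from dX = -X dt + 0.3 dW; the true pair should beat a candidate
# with the right drift but the wrong stumbling rule.
path = simulate(0.0, lambda x: -x, lambda x: 0.3, dt=0.01, steps=2000)
true_score = neg_log_likelihood(path, 0.01, lambda x: -x, lambda x: 0.3)
wrong_score = neg_log_likelihood(path, 0.01, lambda x: -x, lambda x: 1.0)
```

Because the variance term sits inside the score, a recipe that nails the drift but botches the diffusion still loses — which is exactly why fitting both at once gives the clearer picture.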

4. Why This is Better Than the Old Way

The old way of doing this (called Kramers-Moyal expansion) is like trying to sort a massive pile of mixed-up Lego bricks by dumping them into buckets based on size.

  • The Bucket Problem: If you have a simple 1D problem, the buckets work fine. But in a complex 20-dimensional system (like a weather model with 20 interacting variables), the number of buckets grows exponentially with the number of dimensions, so most buckets end up empty and the estimates fall apart. It's slow and messy.
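In one dimension the "buckets" are literally histogram bins: you estimate the drift at each bin by averaging the observed step sizes of all samples that fall into it. A minimal sketch of that binned Kramers-Moyal-style estimate (a simplified illustration, not the full expansion), checked on noiseless data where the true step rate is exactly -x:

```python
import math

def km_drift_estimate(path, dt, n_bins=10):
    """Binned ('bucket') drift estimate in 1D: average dx/dt per bin of x.
    In d dimensions this needs n_bins**d buckets, which is the scaling wall."""
    lo, hi = min(path), max(path)
    width = (hi - lo) / n_bins or 1.0
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, x_next in zip(path, path[1:]):
        b = min(int((x - lo) / width), n_bins - 1)   # which bucket x falls in
        sums[b] += (x_next - x) / dt                 # observed step rate
        counts[b] += 1
    centers = [lo + (b + 0.5) * width for b in range(n_bins)]
    drifts = [s / c if c else float('nan') for s, c in zip(sums, counts)]
    return centers, drifts

# Sanity check on noiseless exponential decay, where dx/dt = -x exactly:
# each bucket's average should recover the drift up to half a bin width.
dt = 0.01
x, path = 1.0, [1.0]
for _ in range(300):
    x += -x * dt
    path.append(x)
centers, drifts = km_drift_estimate(path, dt)
```

The comment in `km_drift_estimate` is the whole story: with 10 bins per axis, 20 dimensions means 10^20 buckets, almost all of them empty.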

The new GP-SDE method doesn't use buckets. It builds the recipe directly.

  • Scalability: It handles complex, high-dimensional systems (like the 20-variable weather model) without getting overwhelmed.
  • Sparse Data: Even if you only have a few blurry snapshots of the drunk person (sparse data), this method can "fill in the gaps" by simulating the steps between the photos, making it very robust.

5. The Superpower: Generative Sampling

Because the new method learns the "stumbling rules" (the noise), it can do something the old methods can't: It can generate new, realistic scenarios.

  • Old Method: "Here is the average path the drunk person took." (One line).
  • New Method: "Here are 50 different possible paths the drunk person could take, all looking realistic with their own unique stumbles and sways."
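Once both rules are learned, drawing those 50 realistic paths is just a matter of re-running the simulation with fresh noise each time. A minimal sketch (the drift and diffusion here stand in for whatever equations were discovered; the names are mine):

```python
import math
import random

def sample_paths(x0, drift, diffusion, dt, steps, n_paths, seed=42):
    """Draw many plausible futures from a learned SDE, not just one mean path."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        x, path = x0, [x0]
        for _ in range(steps):
            x += drift(x) * dt + diffusion(x) * rng.gauss(0.0, math.sqrt(dt))
            path.append(x)
        paths.append(path)
    return paths

# 50 distinct "walks home" under the same learned rules. The spread of the
# endpoints is exactly the uncertainty a drift-only model throws away.
paths = sample_paths(5.0, lambda x: -0.5 * x, lambda x: 0.4,
                     dt=0.01, steps=500, n_paths=50)
endpoints = [p[-1] for p in paths]
```

A drift-only model can only ever return the single deterministic trajectory; the ensemble above is what lets you ask "how bad could it plausibly get?"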

This is crucial for scientists. If you are modeling a virus spread or a financial crash, you don't just want the average outcome; you want to see the range of possible disasters to prepare for them.

Summary

This paper gives scientists a smarter, more flexible way to decode the laws of nature in a noisy world. Instead of ignoring the chaos, they teach computers to evolve mathematical formulas that explain both the order and the chaos, allowing them to predict the future with much greater accuracy and creativity.