Sample-efficient evidence estimation of score-based priors for model selection

This paper proposes DiME, a sample-efficient estimator that leverages intermediate samples from the reverse diffusion sampling process to accurately compute the model evidence of diffusion priors. This enables effective model selection and prior-misfit diagnosis in ill-posed imaging inverse problems, without requiring extensive evaluations of the prior density.

Frederic Wang, Katherine L. Bouman

Published 2026-02-25

Imagine you are a detective trying to reconstruct a crime scene from a blurry, distorted photo. You have the blurry photo (the measurement), but you need to figure out what the original, clear scene looked like (the image).

In the world of science and engineering, this is called an inverse problem. The problem is that the blurry photo could have come from many different clear scenes. To pick the right one, you need a "rulebook" or a "gut feeling" about what a crime scene usually looks like. In math, this rulebook is called a Prior.

The Big Problem: Choosing the Right Rulebook

Usually, scientists just pick a rulebook they think is good. But what if they pick the wrong one?

  • If you use a rulebook that says "crime scenes are always in forests," but the crime happened in a city, your detective work will be biased and wrong.
  • If you use a rulebook that says "crime scenes are always in cities," but it happened in a forest, you'll be wrong again.

The ideal solution is to ask: "Which rulebook is most likely to have produced this specific blurry photo?" In math, this is called calculating the Model Evidence.
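In Bayesian terms, the evidence is p(measurement | rulebook): you average the likelihood of the measurement over every scene the rulebook considers plausible. For Gaussian priors with Gaussian noise this average has a closed form, which lets us sketch the idea in a few lines. The "forest" and "city" priors and all the numbers below are made up for illustration; this is the general Bayesian recipe, not DiME itself:

```python
import numpy as np

def log_evidence(y, prior_mean, prior_var, noise_var):
    """Log evidence p(y | M) for a 1-D Gaussian prior with additive
    Gaussian noise: y = x + n, x ~ N(mu, s2), n ~ N(0, t2).
    Marginalizing out x gives p(y) = N(y; mu, s2 + t2) in closed form."""
    var = prior_var + noise_var
    return -0.5 * (np.log(2 * np.pi * var) + (y - prior_mean) ** 2 / var)

# Two hypothetical "rulebooks" (priors) scored on the same noisy measurement.
y = 4.2                       # the blurry "photo"
noise_var = 0.5
models = {
    "forest": (0.0, 1.0),     # (prior mean, prior variance)
    "city":   (5.0, 1.0),
}
scores = {name: log_evidence(y, mu, v, noise_var)
          for name, (mu, v) in models.items()}
best = max(scores, key=scores.get)
print(best)  # → city: the rulebook most likely to have produced y
```

Picking the rulebook with the highest evidence is exactly the "which rulebook most likely produced this photo?" question, just in a setting simple enough to solve exactly.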

The Old Way: The Impossible Math

For a long time, calculating this "Model Evidence" was like trying to count every single grain of sand on a beach to find one specific grain. It required so much computing power that it was impossible for the most advanced AI models (called Diffusion Models) that scientists use today.

Diffusion models are like a master painter who takes a canvas covered in static noise and slowly, step by step, turns it into a clear picture. They are amazing at filling in the blanks. But because they are so complex, we couldn't easily ask them, "How likely is it that one of the pictures you would paint explains this specific blurry photo?"
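To make the "master painter" concrete, here is a toy sketch of score-based sampling in one dimension. We assume a standard normal target whose score (the gradient of the log density) is known exactly, so `score(x) = -x`; a real diffusion model replaces this function with a trained neural network and a noise schedule:

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    # Score of a standard normal target, known in closed form here.
    # A real diffusion model learns this function with a neural network.
    return -x

# Start from heavy noise and "paint" toward the target distribution with
# unadjusted Langevin dynamics. The intermediate values of x are the
# "footprints" the sampler leaves along the way.
x = rng.normal(loc=0.0, scale=5.0, size=10_000)
step = 0.05
footprints = []
for _ in range(500):
    x = x + step * score(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
    footprints.append(x.copy())

print(round(float(x.std()), 1))  # ≈ 1.0: samples settle into the target
```

The key point for this paper is the `footprints` list: the sampler produces intermediate samples for free while it works, and DiME puts them to use.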

The New Solution: DiME (Diffusion Model Evidence)

The authors of this paper, Frederic Wang and Katherine Bouman, invented a new tool called DiME.

Here is how DiME works, using a simple analogy:

The Analogy: The Hiking Trail

Imagine the Diffusion Model is a hiker walking down a mountain trail from the peak (pure noise) to the valley (the clear image).

  1. The Old Way: To know how likely the hiker was to end up at a specific spot, you had to stop them at every single step, measure the wind, the slope, and the mud, and do a massive calculation. It took forever and often gave wrong answers.
  2. The DiME Way: DiME is like a smart observer who just watches the hiker's path.
    • As the hiker walks down, they naturally leave a trail of footprints (intermediate samples).
    • DiME doesn't need to stop the hiker or do complex math. It just looks at the distance between the hiker's path and the "expected" path of a random walker.
    • If the hiker's path stays very close to the expected path, the rulebook (Prior) is a good fit.
    • If the hiker has to take a weird, winding detour to get to the spot, the rulebook is a bad fit.

By simply measuring the "detours" the hiker takes, DiME can calculate the Model Evidence with just 20 samples, whereas older methods needed thousands.
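For contrast, here is the brute-force baseline that needs those thousands of samples: a naive Monte Carlo estimate of the evidence that averages the likelihood over draws from the prior. This is a generic textbook estimator, not DiME's method; in a toy 1-D Gaussian setting we can check it against the closed-form answer:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_evidence_mc(y, prior_samples, noise_var):
    """Naive Monte Carlo evidence estimate:
    log p(y) ~= log mean_i p(y | x_i), with x_i drawn from the prior.
    Accurate only with many samples, which is the inefficiency that
    sample-efficient estimators like DiME aim to avoid."""
    ll = -0.5 * (np.log(2 * np.pi * noise_var)
                 + (y - prior_samples) ** 2 / noise_var)
    m = ll.max()
    return m + np.log(np.mean(np.exp(ll - m)))  # stable log-mean-exp

# Toy check against the closed form: x ~ N(0, 1), y = x + n, n ~ N(0, 0.5)
y, noise_var = 1.3, 0.5
samples = rng.normal(size=100_000)        # brute force: 100k prior samples
est = log_evidence_mc(y, samples, noise_var)
exact = -0.5 * (np.log(2 * np.pi * 1.5) + y ** 2 / 1.5)
print(round(float(est), 2), round(float(exact), 2))
```

Even in one dimension, the brute-force estimate needs tens of thousands of samples to pin the answer down; in image space the cost explodes, which is why reusing the sampler's own footprints is such a win.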

What Did They Prove?

The authors tested DiME in three ways:

  1. The Math Test: They used a simple, known math problem where the answer was already written down. DiME got the answer almost perfectly, beating all the old, heavy-duty methods.
  2. The "Guess the Digit" Test: They showed the AI a blurry, noisy picture of a handwritten number (like a '6' or a '9'). They had 10 different rulebooks (one for each digit 0-9).
    • Old methods often guessed the wrong digit because they got confused by the noise.
    • DiME correctly identified the digit every single time, even when the image was very blurry.
  3. The Black Hole Test (The Real Deal): This is the coolest part. They used DiME on real data from the Event Horizon Telescope, which took the first picture of a black hole (M87*).
    • They had different rulebooks: one based on black hole physics, one based on general space photos, one based on human faces, and one based on handwritten digits.
    • DiME's Verdict: It correctly said, "The rulebook based on Black Hole Physics is the only one that makes sense for this photo." It even told them that the photo of the black hole fits perfectly within the laws of physics they used to create the rulebook.

Why Does This Matter?

Before DiME, scientists using these powerful AI models had to guess which "rulebook" to use. If they guessed wrong, their scientific conclusions could be biased or wrong.

DiME gives scientists a "truth meter."

  • It allows them to select the best AI model for a specific job.
  • It allows them to validate if their physical theories (like how black holes work) actually match reality.
  • It does all this quickly and efficiently, using the "footprints" the AI leaves behind while it works.

In short, DiME turns a black box into a transparent one, letting us trust the AI's answers in critical scientific discoveries like imaging black holes.
