Gaussian Process Eigenmodes for Statistical and Systematic Uncertainties in Template Fits

This paper proposes replacing traditional per-bin Barlow-Beeston factors and interpolation modifiers with a unified eigenmode basis derived from log-Gaussian Cox process posteriors to efficiently model both statistical and systematic uncertainties in LHC template fits, thereby reducing dimensionality while preserving or bounding the original variance.

Original authors: Vincent Alexander Croft

Published 2026-05-20

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you are trying to find a tiny, rare gem (a new particle) hidden inside a massive, noisy pile of sand (background data) at a giant particle collider. To do this, physicists use a "template"—a map of what the sand pile should look like if no gem is there. They compare their actual observations to this map. If the real pile has a weird bump that the map doesn't predict, that might be the gem.

The problem is that making this map is tricky. The map is built from computer simulations (Monte Carlo), which are like taking a limited number of photos of the sand. If you don't have enough photos, the map gets grainy and full of "static" (statistical noise). If you try to make the map too detailed to see the gem clearly, the static gets so loud you can't trust the map at all.

This paper proposes a new way to build that map using Gaussian Processes (GPs), a mathematical tool for "smooth, intelligent guessing."

Here is the breakdown of the paper's ideas using simple analogies:

1. The Old Way: The "Pixelated" Map

Traditionally, physicists build their map by dividing the data into tiny boxes (bins) and counting the sand in each box.

  • The Problem: If you have a limited number of simulation photos, some boxes will be empty or have very few grains. To handle the uncertainty of these empty boxes, the old method adds a "wobble factor" (a nuisance parameter) to every single box.
  • The Consequence: If you have a 3D map with millions of boxes, you end up with millions of wobble factors. It's like trying to steer a ship by adjusting a separate rudder for every single plank of wood. It's computationally heavy, and when the data is scarce, the map becomes so shaky that it might hide the gem or create fake ones.
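To make the parameter count concrete, here is a minimal sketch of the per-bin approach, assuming unweighted simulated events and one nuisance parameter per bin. The bin count, event count, and variable names are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a 3D template with 20 bins per axis, filled from a
# limited number of unweighted simulated events (numbers chosen for illustration).
bins_per_axis = 20
n_bins = bins_per_axis ** 3
n_sim_events = 5000
mc_counts = rng.multinomial(n_sim_events, np.full(n_bins, 1.0 / n_bins))

# Per-bin treatment: one "wobble factor" (nuisance parameter) for every bin.
n_nuisance = n_bins

# Relative statistical uncertainty of an unweighted bin is 1/sqrt(N).
# It is undefined for empty bins, marked here as infinity.
filled = mc_counts > 0
rel_unc = np.full(n_bins, np.inf)
rel_unc[filled] = 1.0 / np.sqrt(mc_counts[filled])

print(f"bins / nuisance parameters: {n_nuisance}")
print(f"empty bins: {np.count_nonzero(~filled)}")
print(f"median relative uncertainty in filled bins: {np.median(rel_unc[filled]):.2f}")
```

With more bins than simulated events, a large share of bins is empty and the filled ones carry uncertainties close to 100%, which is the "shaky map" regime described above.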

2. The New Way: The "Smooth River" Map

The author suggests replacing the pixelated boxes with a smooth, flowing river (a mathematical function). Instead of counting grains in boxes, the method uses a Gaussian Process to draw a smooth curve that fits the sand data.

  • The Magic: Because the curve is smooth, it "knows" that if one part of the river is high, the neighbors are likely high too. It borrows strength from its neighbors.
  • The Result: Even with very few photos (low statistics), the map stays smooth and reliable instead of turning grainy. The paper proves mathematically that the smooth map's uncertainty is bounded by that of the old pixelated map: it can match it or improve on it, but never be worse.
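The "borrowing strength" idea can be sketched with ordinary GP regression on binned counts. This is a simplified stand-in, not the paper's method: the paper works with log-Gaussian Cox process posteriors, whereas the sketch below assumes Gaussian noise on each bin, a squared-exponential kernel, and hand-picked hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1D template: a falling spectrum in 40 bins with limited statistics.
n_bins = 40
centers = np.linspace(0.0, 1.0, n_bins)
truth = 50.0 * np.exp(-3.0 * centers)
counts = rng.poisson(truth).astype(float)

# Gaussian approximation to Poisson noise (assumption): variance ~ max(count, 1).
noise_var = np.maximum(counts, 1.0)

def rbf_kernel(x, amplitude, length_scale):
    """Squared-exponential kernel: nearby bins are strongly correlated."""
    d = x[:, None] - x[None, :]
    return amplitude**2 * np.exp(-0.5 * (d / length_scale) ** 2)

# Hand-picked hyperparameters (assumption; a real fit would optimise them).
K = rbf_kernel(centers, amplitude=30.0, length_scale=0.3)
prior_mean = np.full(n_bins, counts.mean())

# Standard GP regression posterior evaluated at the bin centres.
A = K + np.diag(noise_var)
post_mean = prior_mean + K @ np.linalg.solve(A, counts - prior_mean)
post_cov = K - K @ np.linalg.solve(A, K)
post_var = np.diag(post_cov)

print(f"mean per-bin variance, raw histogram: {noise_var.mean():.2f}")
print(f"mean per-bin variance, GP posterior:  {post_var.mean():.2f}")
```

In this Gaussian setting the posterior variance in each bin cannot exceed that bin's own noise variance, because the posterior covariance equals N - N (K + N)^-1 N with N the diagonal noise matrix. That mirrors, in simplified form, the variance bound the paper claims.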

3. The "Eigenmode" Trick: Compressing the Noise

The paper also tackles "systematic uncertainties"—these are like known flaws in the camera lens (e.g., the lens might be slightly blurry or shifted).

  • The Old Way: You add a separate knob for every possible way the lens could be wrong, for every single box.
  • The New Way: The author uses a technique called eigenmode decomposition. Imagine the map has a few "fundamental shapes" (like a wave, a hill, or a dip) that represent the most common ways the data can wiggle due to noise or lens flaws.
  • The Benefit: Instead of adjusting millions of knobs, you only need to adjust a handful of these "fundamental shape" knobs. It's like compressing a huge raw photo into a small JPEG: you keep the most important information (the shape of the signal) and throw away the redundant detail. This makes the math much faster and easier to solve.
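The compression step can be illustrated with a plain eigendecomposition of a bin-to-bin covariance matrix. The covariance below is a hypothetical smooth kernel and the 99% variance threshold is an assumed choice; the paper's own truncation rule may differ.

```python
import numpy as np

n_bins = 50
x = np.linspace(0.0, 1.0, n_bins)

# Hypothetical smooth bin-to-bin covariance (squared-exponential form).
d = x[:, None] - x[None, :]
cov = 4.0 * np.exp(-0.5 * (d / 0.2) ** 2)

# Eigendecomposition of the symmetric covariance, sorted largest mode first.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
eigvecs = eigvecs[:, order]

# Keep the smallest number of modes that explains 99% of the total variance.
explained = np.cumsum(eigvals) / eigvals.sum()
n_modes = int(np.searchsorted(explained, 0.99) + 1)

# Each retained mode acts as one "knob": a unit-Gaussian parameter that shifts
# the whole template along the shape sqrt(eigenvalue) * eigenvector.
modes = eigvecs[:, :n_modes] * np.sqrt(eigvals[:n_modes])
cov_truncated = modes @ modes.T
retained_var = np.trace(cov_truncated) / np.trace(cov)

print(f"knobs needed: {n_modes} instead of {n_bins}")
print(f"fraction of variance retained: {retained_var:.4f}")
```

Because the discarded modes only remove non-negative contributions, the truncated per-bin variance never exceeds the full one; what is lost is the small remainder below the chosen threshold.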

4. The Trade-off: The "Two-Step" vs. "One-Pass"

The paper is honest about a limitation.

  • The Old Method (Barlow-Beeston): This is like a "joint profile." It looks at the data and the map simultaneously, adjusting the map's wobbles in real time as it searches for the gem. It is the most statistically efficient way to find the gem when data is scarce.
  • The New Method (GP Eigenmode): This is a "two-step" process. First, it builds the smooth map from the simulation. Second, it uses that fixed map to find the gem.
  • The Catch: Because the map is fixed in the first step, it can't adapt perfectly to the specific noise in the final data. The paper shows that if you have very little data (scarce photos), the old method is slightly better at finding the gem because it adapts better. However, if you have lots of data (which is common in modern experiments), the difference is tiny, and the new method's speed and simplicity win out.

Summary of the Paper's Claims

  • What they did: They replaced the standard "pixelated" histogram maps with smooth "Gaussian Process" maps and compressed the uncertainty into a few "eigenmodes" (fundamental shapes).
  • What they proved:
    1. The new smooth maps are mathematically guaranteed to be at least as precise as the old pixelated maps, even when data is scarce.
    2. The new method can reduce the number of "wobble knobs" (parameters) from thousands to just a few dozen, making complex 3D analyses possible.
    3. The old method is still the "gold standard" for pure statistical efficiency when data is extremely rare, but the new method is practically superior for modern, complex experiments where systematic errors (like lens flaws) dominate.
  • The Tool: They built this into a free software package called Histimator so other physicists can use it immediately.

In short, the paper offers a way to turn a grainy, shaky, and computationally heavy map into a smooth, stable, and efficient one, allowing physicists to search for new particles in higher dimensions without getting lost in the math.
