Implementation of full and simplified likelihoods in CheckMATE

This paper presents the implementation of full and simplified likelihood models for multibin signal regions within the CheckMATE framework, incorporating 13 ATLAS and CMS searches to enable statistical combinations that enhance sensitivity and allow for the integration of orthogonal search channels.

Iñaki Lara, Krzysztof Rolbiecki

Published 2026-03-06

Imagine the Large Hadron Collider (LHC) as the world's most powerful particle smasher. Every time it runs, it creates a mountain of data—billions of collisions that look like a chaotic, glittering snowstorm. Physicists are desperately looking for a single, unique snowflake that doesn't belong to the natural world: a sign of "New Physics" (like dark matter or supersymmetry).

For years, the ATLAS and CMS experiments (the two giant detectors at the LHC) would say, "We looked in this specific bin of snow, and we didn't find anything. Here is the limit." But as the data grew, they realized that looking at just one bin was like trying to find a needle in a haystack by only looking at the top inch of the hay. They started organizing the data into complex, multi-layered maps with hundreds of bins, using sophisticated statistics to find patterns.

The problem? These complex maps were locked behind a "black box." If a theorist wanted to test their own crazy new idea against this data, they couldn't easily do it because the experiments only gave simplified summaries.

Enter CheckMATE.

Think of CheckMATE as a universal translator and a high-powered calculator for physicists. It takes the raw data from the LHC experiments and lets theorists test their own theories against it. But until now, CheckMATE was a bit like a calculator that could only do simple addition. It couldn't handle the complex, multi-layered statistical maps the experiments were using.

What this paper does:
The authors (Iñaki Lara and Krzysztof Rolbiecki) have upgraded CheckMATE to handle the full complexity of these modern experiments. They've added two new "modes" of operation:

1. The "Full Likelihood" Mode (The Master Chef)

Imagine the LHC experiments as a master chef who has a secret recipe for a complex stew. The recipe includes not just the ingredients (the data), but also exactly how much salt, pepper, and heat were used, and how they interact with each other.

  • The Upgrade: The authors have managed to get the "secret recipe" (the full statistical model) for 9 ATLAS searches and 4 CMS searches.
  • How it works: They use tools called Spey and pyhf (think of these as high-end kitchen appliances) to cook the stew exactly the way the experiments did.
  • The Catch: Cooking this stew is slow. It takes a lot of computing power and time. It's like baking a soufflé; if you want the perfect result, you have to be patient and precise. This method is the most accurate but is too slow if you want to test thousands of different theories quickly.
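To make the "secret recipe" idea concrete, here is a minimal, stdlib-only Python sketch of what a full likelihood is under the hood: a product of Poisson terms (one per bin) times Gaussian constraints on nuisance parameters, which must be re-fitted ("profiled") for every hypothesis you test. All yields and uncertainties below are invented, each bin gets just one nuisance parameter, and the profiling is a brute-force scan; real pyhf models carry hundreds of correlated nuisance parameters, which is exactly why this mode is slow.

```python
import math

def poisson_logpmf(n, lam):
    """Log of the Poisson probability P(n | lam)."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def full_nll(mu, data, signal, bkg, bkg_unc, grid=201, width=5.0):
    """Profile a per-bin nuisance parameter theta_i:
    lambda_i = mu*s_i + b_i*(1 + theta_i), with theta_i ~ N(0, sigma_i).
    The scan over theta is a stand-in for a real minimizer."""
    nll = 0.0
    for n, s, b, sig in zip(data, signal, bkg, bkg_unc):
        best = float("inf")
        for k in range(grid):
            theta = -width * sig + 2 * width * sig * k / (grid - 1)
            lam = mu * s + b * (1.0 + theta)
            if lam <= 0:
                continue
            val = -poisson_logpmf(n, lam) + 0.5 * (theta / sig) ** 2
            best = min(best, val)
        nll += best
    return nll

# Invented three-bin signal region: observed counts match the background.
data   = [52, 31, 10]
signal = [8.0, 5.0, 3.0]
bkg    = [52.0, 31.0, 10.0]
unc    = [0.10, 0.15, 0.30]   # fractional background uncertainties

q0 = full_nll(0.0, data, signal, bkg, unc)  # background-only
q1 = full_nll(1.0, data, signal, bkg, unc)  # background + signal
print(q0, q1)  # q0 < q1: the signal-free fit describes this data better
```

Even this toy version has to re-minimize the nuisance parameters for every value of mu; scale that up to hundreds of parameters and thousands of model points and the cost of the "soufflé" becomes clear.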

2. The "Simplified Likelihood" Mode (The Food Processor)

Now, imagine you need to test 1,000 different recipes quickly. You can't bake a soufflé for each one.

  • The Upgrade: The authors created a "simplified" version of the recipe. Instead of tracking every single interaction between ingredients, they approximate the flavor profile using a "correlated background model."
  • How it works: It's like using a food processor. It chops everything up quickly and gives you a very good approximation of the taste. It's much faster (seconds instead of minutes) and allows you to scan through huge numbers of theories.
  • The Trade-off: Sometimes, the approximation might be slightly off. In a few specific cases (like the "compressed mass spectra" search), the simplified version was a bit too aggressive, ruling out theories that might actually be okay. But for most searches, it's a fantastic balance of speed and accuracy.
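The "correlated background model" can be sketched in the same style: instead of one nuisance parameter per source of uncertainty, the background yields get additive shifts constrained by a single covariance matrix. The two-bin example and covariance below are invented for illustration and are not CheckMATE's implementation; they just show the flavor of the approximation.

```python
import math

def poisson_logpmf(n, lam):
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def simplified_nll(mu, data, signal, bkg, cov, grid=41, width=3.0):
    """Simplified likelihood for two bins: Poisson counts with background
    shifts (d1, d2) constrained by a correlated 2x2 Gaussian (covariance
    cov). Profiled by a brute-force grid scan."""
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    inv = ((s22 / det, -s12 / det), (-s12 / det, s11 / det))
    sd = (math.sqrt(s11), math.sqrt(s22))
    best = float("inf")
    for i in range(grid):
        d1 = -width * sd[0] + 2 * width * sd[0] * i / (grid - 1)
        for j in range(grid):
            d2 = -width * sd[1] + 2 * width * sd[1] * j / (grid - 1)
            lam1 = mu * signal[0] + bkg[0] + d1
            lam2 = mu * signal[1] + bkg[1] + d2
            if lam1 <= 0 or lam2 <= 0:
                continue
            gauss = 0.5 * (inv[0][0] * d1 * d1 + 2 * inv[0][1] * d1 * d2
                           + inv[1][1] * d2 * d2)
            best = min(best, -poisson_logpmf(data[0], lam1)
                             - poisson_logpmf(data[1], lam2) + gauss)
    return best

data, signal, bkg = [45, 22], [6.0, 4.0], [45.0, 22.0]
cov = ((25.0, 10.0), (10.0, 16.0))  # 5- and 4-event uncertainties, correlated
q0 = simplified_nll(0.0, data, signal, bkg, cov)
q1 = simplified_nll(1.0, data, signal, bkg, cov)
print(q0, q1)
```

The whole statistical model is now just the yields plus one covariance matrix, which is why this mode evaluates in seconds; the price is that any non-Gaussian structure in the true likelihood (the source of the over-aggressive exclusions mentioned above) is lost.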

The "Control Regions" (The Safety Net)

In these experiments, scientists don't just look at where they expect new physics (Signal Regions); they also look at places where they know only standard physics happens (Control Regions).

  • Think of the Signal Region as the "treasure hunt" area and the Control Region as the "calibration zone."
  • The authors implemented a way to use these calibration zones to fine-tune the treasure hunt. In some cases, ignoring the calibration zones made the search too strict (ruling out too many things). By including them, the search becomes more realistic and fair.

Why does this matter?

Before this update, if a theorist had a new, complex idea, they might have to wait weeks to see if it was ruled out, or they might have to use a simplified method that wasn't accurate enough.

Now, with this update:

  1. Speed: They can test ideas in seconds using the "Simplified" mode.
  2. Precision: If an idea looks promising, they can run the "Full" mode to get a definitive answer.
  3. Combination: They can now statistically combine different searches to get a stronger constraint, provided the searches look at non-overlapping ("orthogonal") sets of events. This is like searching both the top and the bottom of the haystack and pooling the results to be much more sure the needle isn't there.
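Why combining helps can be seen in a few lines: for independent channels the log-likelihoods simply add, so the combined exclusion is always at least as strong as either channel alone. The channels and the limit scan below are illustrative only (a crude -2ΔlnL threshold crossing, not the CLs procedure real analyses use).

```python
import math

def poisson_logpmf(n, lam):
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def delta_nll(mu, n, s, b):
    """-2 * log-likelihood ratio for one counting channel, relative to
    the best fit with mu >= 0. Data equals background in these toys,
    so the constrained best fit sits at mu = 0."""
    def nll(m):
        return -poisson_logpmf(n, m * s + b)
    return 2.0 * (nll(mu) - nll(0.0))

def upper_limit(channels, threshold=2.71, step=0.01):
    """Smallest mu whose summed -2 dlnL crosses the threshold
    (a rough one-sided 95% proxy, standing in for CLs)."""
    mu = 0.0
    while sum(delta_nll(mu, *c) for c in channels) < threshold:
        mu += step
    return mu

chA = (30, 4.0, 30.0)   # (observed, signal yield per unit mu, background)
chB = (12, 3.0, 12.0)
limA = upper_limit([chA])
limB = upper_limit([chB])
limAB = upper_limit([chA, chB])
print(limA, limB, limAB)  # the combined limit is the smallest (strongest)
```

Because the per-channel -2ΔlnL curves add, the combined curve crosses any threshold at a smaller mu than either channel alone, which is the statistical content of searching both ends of the haystack at once.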

In a nutshell:
The authors have given physicists a new, super-charged toolkit. They can now take the complex, multi-dimensional maps of the LHC and use them to test their wildest theories about the universe, either with the speed of a sprint (Simplified) or the precision of a marathon runner (Full). This helps the scientific community move faster in the hunt for the next big discovery.