Implementation of full and simplified likelihoods in CheckMATE

This paper presents the implementation of full and simplified likelihood models for multibin signal regions within the CheckMATE framework, incorporating 13 ATLAS and CMS searches to enable statistical combinations that enhance sensitivity and allow for the integration of orthogonal search channels.

Iñaki Lara, Krzysztof Rolbiecki

Published 2026-03-06

Imagine the Large Hadron Collider (LHC) as the world's most powerful particle smasher. Every time it runs, it creates a mountain of data—billions of collisions that look like a chaotic, glittering snowstorm. Physicists are desperately looking for a single, unique snowflake that doesn't belong to the natural world: a sign of "New Physics" (like dark matter or supersymmetry).

For years, the ATLAS and CMS experiments (the two giant detectors at the LHC) would say, "We looked in this specific bin of snow, and we didn't find anything. Here is the limit." But as the data grew, they realized that looking at just one bin was like trying to find a needle in a haystack by only looking at the top inch of the hay. They started organizing the data into complex, multi-layered maps with hundreds of bins, using sophisticated statistics to find patterns.

The problem? These complex maps were locked behind a "black box." If a theorist wanted to test their own crazy new idea against this data, they couldn't easily do it because the experiments only gave simplified summaries.

Enter CheckMATE.

Think of CheckMATE as a universal translator and a high-powered calculator for physicists. It takes the raw data from the LHC experiments and lets theorists test their own theories against it. But until now, CheckMATE was a bit like a calculator that could only do simple addition. It couldn't handle the complex, multi-layered statistical maps the experiments were using.

What this paper does:
The authors (Iñaki Lara and Krzysztof Rolbiecki) have upgraded CheckMATE to handle the full complexity of these modern experiments. They've added two new "modes" of operation:

1. The "Full Likelihood" Mode (The Master Chef)

Imagine the LHC experiments as a master chef who has a secret recipe for a complex stew. The recipe includes not just the ingredients (the data), but also exactly how much salt, pepper, and heat were used, and how they interact with each other.

  • The Upgrade: The authors have managed to get the "secret recipe" (the full statistical model) for 9 ATLAS searches and 4 CMS searches.
  • How it works: They use tools called Spey and pyhf (think of these as high-end kitchen appliances) to cook the stew exactly the way the experiments did.
  • The Catch: Cooking this stew is slow. It takes a lot of computing power and time. It's like baking a soufflé; if you want the perfect result, you have to be patient and precise. This method is the most accurate but is too slow if you want to test thousands of different theories quickly.
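To make the "secret recipe" idea concrete, here is a minimal, stdlib-only Python sketch of what a full likelihood is under the hood: a product of Poisson terms (one per bin) times Gaussian constraints on nuisance parameters, which must be re-fitted ("profiled") for every hypothesis you test. All yields and uncertainties below are invented, each bin gets just one nuisance parameter, and the profiling is a brute-force scan; real pyhf models carry hundreds of correlated nuisance parameters, which is exactly why this mode is slow.

```python
import math

def poisson_logpmf(n, lam):
    """Log of the Poisson probability P(n | lam)."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def full_nll(mu, data, signal, bkg, bkg_unc, grid=201, width=5.0):
    """Profile a per-bin nuisance parameter theta_i:
    lambda_i = mu*s_i + b_i*(1 + theta_i), with theta_i ~ N(0, sigma_i).
    The scan over theta is a stand-in for a real minimizer."""
    nll = 0.0
    for n, s, b, sig in zip(data, signal, bkg, bkg_unc):
        best = float("inf")
        for k in range(grid):
            theta = -width * sig + 2 * width * sig * k / (grid - 1)
            lam = mu * s + b * (1.0 + theta)
            if lam <= 0:
                continue
            val = -poisson_logpmf(n, lam) + 0.5 * (theta / sig) ** 2
            best = min(best, val)
        nll += best
    return nll

# Invented three-bin signal region: observed counts match the background.
data   = [52, 31, 10]
signal = [8.0, 5.0, 3.0]
bkg    = [52.0, 31.0, 10.0]
unc    = [0.10, 0.15, 0.30]   # fractional background uncertainties

q0 = full_nll(0.0, data, signal, bkg, unc)  # background-only
q1 = full_nll(1.0, data, signal, bkg, unc)  # background + signal
print(q0, q1)  # q0 < q1: the signal-free fit describes this data better
```

Even this toy version has to re-minimize the nuisance parameters for every value of mu; scale that up to hundreds of parameters and thousands of model points and the cost of the "soufflé" becomes clear.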

2. The "Simplified Likelihood" Mode (The Food Processor)

Now, imagine you need to test 1,000 different recipes quickly. You can't bake a soufflé for each one.

  • The Upgrade: The authors created a "simplified" version of the recipe. Instead of tracking every single interaction between ingredients, they approximate the flavor profile using a "correlated background model."
  • How it works: It's like using a food processor. It chops everything up quickly and gives you a very good approximation of the taste. It's much faster (seconds instead of minutes) and allows you to scan through huge numbers of theories.
  • The Trade-off: Sometimes, the approximation might be slightly off. In a few specific cases (like the "compressed mass spectra" search), the simplified version was a bit too aggressive, ruling out theories that might actually be okay. But for most searches, it's a fantastic balance of speed and accuracy.
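The "correlated background model" can be sketched in the same style: instead of one nuisance parameter per source of uncertainty, the background yields get additive shifts constrained by a single covariance matrix. The two-bin example and covariance below are invented for illustration and are not CheckMATE's implementation; they just show the flavor of the approximation.

```python
import math

def poisson_logpmf(n, lam):
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def simplified_nll(mu, data, signal, bkg, cov, grid=41, width=3.0):
    """Simplified likelihood for two bins: Poisson counts with background
    shifts (d1, d2) constrained by a correlated 2x2 Gaussian (covariance
    cov). Profiled by a brute-force grid scan."""
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    inv = ((s22 / det, -s12 / det), (-s12 / det, s11 / det))
    sd = (math.sqrt(s11), math.sqrt(s22))
    best = float("inf")
    for i in range(grid):
        d1 = -width * sd[0] + 2 * width * sd[0] * i / (grid - 1)
        for j in range(grid):
            d2 = -width * sd[1] + 2 * width * sd[1] * j / (grid - 1)
            lam1 = mu * signal[0] + bkg[0] + d1
            lam2 = mu * signal[1] + bkg[1] + d2
            if lam1 <= 0 or lam2 <= 0:
                continue
            gauss = 0.5 * (inv[0][0] * d1 * d1 + 2 * inv[0][1] * d1 * d2
                           + inv[1][1] * d2 * d2)
            best = min(best, -poisson_logpmf(data[0], lam1)
                             - poisson_logpmf(data[1], lam2) + gauss)
    return best

data, signal, bkg = [45, 22], [6.0, 4.0], [45.0, 22.0]
cov = ((25.0, 10.0), (10.0, 16.0))  # 5- and 4-event uncertainties, correlated
q0 = simplified_nll(0.0, data, signal, bkg, cov)
q1 = simplified_nll(1.0, data, signal, bkg, cov)
print(q0, q1)
```

The whole statistical model is now just the yields plus one covariance matrix, which is why this mode evaluates in seconds; the price is that any non-Gaussian structure in the true likelihood (the source of the over-aggressive exclusions mentioned above) is lost.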

The "Control Regions" (The Safety Net)

In these experiments, scientists don't just look at where they expect new physics (Signal Regions); they also look at places where they know only standard physics happens (Control Regions).

  • Think of the Signal Region as the "treasure hunt" area and the Control Region as the "calibration zone."
  • The authors implemented a way to use these calibration zones to fine-tune the treasure hunt. In some cases, ignoring the calibration zones made the search too strict (ruling out too many things). By including them, the search becomes more realistic and fair.

Why does this matter?

Before this update, if a theorist had a new, complex idea, they might have to wait weeks to see if it was ruled out, or they might have to use a simplified method that wasn't accurate enough.

Now, with this update:

  1. Speed: They can test ideas in seconds using the "Simplified" mode.
  2. Precision: If an idea looks promising, they can run the "Full" mode to get a definitive answer.
  3. Combination: They can now statistically combine different searches to get a stronger constraint, provided the searches look at non-overlapping ("orthogonal") sets of events. This is like searching both the top and the bottom of the haystack and pooling the results to be much more sure the needle isn't there.
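Why combining helps can be seen in a few lines: for independent channels the log-likelihoods simply add, so the combined exclusion is always at least as strong as either channel alone. The channels and the limit scan below are illustrative only (a crude -2ΔlnL threshold crossing, not the CLs procedure real analyses use).

```python
import math

def poisson_logpmf(n, lam):
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def delta_nll(mu, n, s, b):
    """-2 * log-likelihood ratio for one counting channel, relative to
    the best fit with mu >= 0. Data equals background in these toys,
    so the constrained best fit sits at mu = 0."""
    def nll(m):
        return -poisson_logpmf(n, m * s + b)
    return 2.0 * (nll(mu) - nll(0.0))

def upper_limit(channels, threshold=2.71, step=0.01):
    """Smallest mu whose summed -2 dlnL crosses the threshold
    (a rough one-sided 95% proxy, standing in for CLs)."""
    mu = 0.0
    while sum(delta_nll(mu, *c) for c in channels) < threshold:
        mu += step
    return mu

chA = (30, 4.0, 30.0)   # (observed, signal yield per unit mu, background)
chB = (12, 3.0, 12.0)
limA = upper_limit([chA])
limB = upper_limit([chB])
limAB = upper_limit([chA, chB])
print(limA, limB, limAB)  # the combined limit is the smallest (strongest)
```

Because the per-channel -2ΔlnL curves add, the combined curve crosses any threshold at a smaller mu than either channel alone, which is the statistical content of searching both ends of the haystack at once.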

In a nutshell:
The authors have given physicists a new, super-charged toolkit. They can now take the complex, multi-dimensional maps of the LHC and use them to test their wildest theories about the universe, either with the speed of a sprint (Simplified) or the precision of a marathon runner (Full). This helps the scientific community move faster in the hunt for the next big discovery.