MEDIC: a network for monitoring data quality in collider experiments

Imagine you are the conductor of a massive, high-speed orchestra. This isn't a normal orchestra; it's the Large Hadron Collider (LHC), a machine so complex it's like a city of sensors built underground to smash particles together. Every second, this orchestra plays millions of notes (data points).

The goal? To hear the music of the universe and find new particles. But here's the problem: sometimes, a violin string snaps, a drumstick breaks, or a microphone gets unplugged. If the conductor (the scientists) doesn't notice immediately, the recording is ruined, and they might think a broken drum is a new type of music.

This is where MEDIC comes in.

The Problem: Too Much Noise, Too Many Sensors

In the past, checking if the orchestra was playing correctly was done by human "shifters." These are experts who sit in front of screens, looking at charts and graphs (histograms) to see if the data looks right.

The Issue: The LHC is so big and fast that humans can't keep up. It's like trying to spot a single broken violin string in a stadium full of players while listening through a single microphone. The work is slow, tiring, and prone to error.

The Solution: MEDIC (The AI Conductor)

The authors of this paper built a new tool called MEDIC (Monitoring for Event Data Integrity and Consistency). Think of MEDIC as a super-smart, tireless AI conductor that listens to the orchestra in real-time.

Here is how it works, broken down into simple steps:

1. The Training Ground: A "Fake" Orchestra

You can't teach a new AI by letting it listen to the real orchestra immediately, because if it makes a mistake, you lose real data. Instead, the scientists built a virtual simulation.

The Metaphor: Imagine they built a perfect, digital twin of the LHC orchestra. They programmed this digital twin to play perfectly, but then they intentionally broke things. They "turned off" the digital violins in the front row, or "muted" the drums in the back.
The Result: They created a massive library of "perfect" recordings and "broken" recordings. This is the school where MEDIC learns. It learns to recognize that when the digital drums go silent, it's a glitch, not a new song.

2. The Brain: How MEDIC Thinks

MEDIC isn't just a simple rule-checker; it's a neural network (a type of AI loosely inspired by the human brain).

The Inputs: MEDIC doesn't look at the whole orchestra at once. Instead, it looks at small, overlapping groups of collision events (called "windows"). From each event it picks 30 random instruments (tracks) and 30 random microphones (towers) to keep things fast.
The Magic: It uses a technique called Transformers (the same technology behind modern chatbots). This lets MEDIC treat the instruments as an unordered set: the order in which they are listed doesn't matter, only which ones are playing and how they sound.
The Output: After listening to a small window of time, MEDIC doesn't just say "Good" or "Bad." It gives a probability score: "I'm 90% sure this is a normal run, 5% sure the front drums are broken, and 5% sure the back violins are silent."

3. The Sliding Window: Catching the Glitch

Real life isn't a single snapshot; it's a movie. MEDIC watches the data like a sliding window moving across a film strip.

The Metaphor: Imagine watching a movie through a small square frame. As the movie plays, the frame slides forward. If the frame catches a scene where the lights flicker, MEDIC notes it. If the lights flicker for just one frame, it might be a glitch in the camera. But if the lights flicker for 10 frames in a row, MEDIC raises a red flag: "Something is definitely wrong with the power supply!"
This helps avoid false alarms. It waits for a pattern before shouting "Fire!"
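
For readers who like to see the idea as code, here is a minimal sketch of the "wait for a pattern" rule. It is an illustration of the metaphor only, not the paper's actual alarm logic: the function name `persistent_alarm` and the threshold of 10 consecutive windows are assumptions borrowed from the film-strip example above.

```python
def persistent_alarm(flags, min_run=10):
    """Return the index of the window at which the alarm fires, or None.

    flags:   one boolean per window (True = that window looked anomalous).
    min_run: hypothetical number of consecutive anomalous windows required
             before raising the alarm (an assumption, not a value from the paper).
    """
    run = 0
    for i, flagged in enumerate(flags):
        run = run + 1 if flagged else 0  # any normal window resets the streak
        if run >= min_run:
            return i
    return None


single_flicker = [False] * 5 + [True] + [False] * 5
sustained_fault = [False] * 5 + [True] * 12

print(persistent_alarm(single_flicker))   # None: one odd window is ignored
print(persistent_alarm(sustained_fault))  # 14: the tenth flagged window in a row
```

Requiring a streak trades a short delay for fewer false alarms; a smaller `min_run` reacts faster but is easier to fool.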

Why This Matters

The paper reports that, on simulated data, MEDIC labels roughly 90% of windows correctly (89.7% multi-class accuracy with a window of 30 events).

Speed: It can process data much faster than a human.
Precision: It can tell you exactly which part of the detector is broken (e.g., "The barrel section of the calorimeter is dead"), not just that "something is wrong."
Future-Proof: Because it learns from simulations, if the LHC gets upgraded with new, bigger sensors, the scientists can just update the "fake orchestra" simulation and retrain MEDIC immediately. They don't have to wait for real data to break things first.

The Bottom Line

MEDIC is a safety net made of AI. It allows scientists to stop staring at boring charts and start focusing on the physics. It acts as a tireless guardian, constantly listening to the massive, complex machine, ready to whisper, "Hey, the left-side microphones are acting up," so the humans can fix it before the music is ruined.

This approach represents a shift from "checking the data after the fact" to "automatically guarding the data as it happens," which is essential for the future of high-energy physics.

1. Problem Statement

Data Quality Monitoring (DQM) is critical in high-energy physics (HEP) experiments, such as those at the Large Hadron Collider (LHC), to ensure recorded data is suitable for physics analysis. Current DQM systems face significant challenges:

Scale and Complexity: Detectors generate unprecedented data volumes with complex substructures.
Human Limitations: Traditional DQM relies on human "shifters" reviewing histograms to compare live data against reference datasets. This process is labor-intensive, prone to human error, and struggles to keep pace with real-time data rates.
Detection Granularity: Existing methods often rely on aggregated histogram data, which can obscure subtle, event-level anomalies or fail to pinpoint the specific source of a malfunction (e.g., a specific calorimeter region).
Reference Data Dependency: Training ML models on real data requires pre-certified "good" data, which is not always available immediately after detector upgrades or configuration changes.

2. Methodology

The authors propose MEDIC (Monitoring for Event Data Integrity and Consistency), an end-to-end, simulation-driven Deep Learning framework designed to automate anomaly detection and localization.

A. Simulation-Driven Approach

Instead of relying on real experimental data for training, the authors utilize a modified version of Delphes, a fast, multi-purpose detector simulation framework.

Anomaly Generation: They introduced a feature to simulate specific detector malfunctions by deactivating specific regions (e.g., HCAL barrel, endcap, or forward regions) within the simulation.
Physics Consistency: The underlying physics events (proton-proton collisions at $\sqrt{s} = 13$ TeV) are generated using MadGraph5_aMC@NLO and Pythia8. These events are then passed through Delphes under four configurations:
1. Normal: All components active (Reference).
2. HCAL Barrel Glitch: 5° towers deactivated.
3. HCAL Endcap Glitch: 10° towers deactivated.
4. HCAL Forward Glitch: 20° towers deactivated.
Dataset Construction: A sliding-window approach is used to simulate continuous data-taking. Each window contains $W$ sequential events. To keep online monitoring computationally efficient, only 30 randomly selected tracks (7 features each) and 30 randomly selected towers (8 features each) are retained per event, along with the global Missing Transverse Energy (MET).
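
The dataset construction can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `sample_fixed` and `sliding_windows`, the stride of one event, and the zero-padding of events with fewer than 30 objects are all hypothetical, and MET is left out for brevity. Only the counts (30 tracks with 7 features, 30 towers with 8 features) come from the description above.

```python
import numpy as np

N_KEEP = 30          # objects retained per event, as stated in the text
TRACK_FEATURES = 7
TOWER_FEATURES = 8


def sample_fixed(objects, n_keep, rng):
    """Randomly keep n_keep rows of `objects` (shape: n_objects x n_features).

    Events with fewer than n_keep objects are zero-padded; the padding
    scheme is an assumption made for this sketch.
    """
    out = np.zeros((n_keep, objects.shape[1]), dtype=objects.dtype)
    n = min(n_keep, len(objects))
    if n > 0:
        idx = rng.choice(len(objects), size=n, replace=False)
        out[:n] = objects[idx]
    return out


def sliding_windows(n_events, window, stride=1):
    """Yield (start, stop) indices of overlapping windows of `window` events."""
    for start in range(0, n_events - window + 1, stride):
        yield start, start + window


rng = np.random.default_rng(0)
n_events, W = 50, 30

# Toy events with a varying number of tracks and towers (random placeholders).
track_list, tower_list = [], []
for _ in range(n_events):
    tracks = rng.normal(size=(int(rng.integers(5, 80)), TRACK_FEATURES))
    towers = rng.normal(size=(int(rng.integers(5, 120)), TOWER_FEATURES))
    track_list.append(sample_fixed(tracks, N_KEEP, rng))
    tower_list.append(sample_fixed(towers, N_KEEP, rng))

track_arr = np.stack(track_list)   # (n_events, 30, 7)
tower_arr = np.stack(tower_list)   # (n_events, 30, 8)

windows = [(track_arr[a:b], tower_arr[a:b]) for a, b in sliding_windows(n_events, W)]
print(len(windows), windows[0][0].shape, windows[0][1].shape)
```

With 50 events and $W = 30$ at stride 1, this yields 21 overlapping windows, each holding a `(30, 30, 7)` track array and a `(30, 30, 8)` tower array.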

B. MEDIC Network Architecture

The neural network is designed to handle heterogeneous inputs (tracks, towers, and MET) and temporal sequences.

Input Branches:
- Tracks & Towers: Treated as unordered sets. They are projected via linear layers and processed by Transformer encoders (multi-head self-attention) to ensure permutation invariance. An attention pooling layer aggregates these into fixed-length vectors.
- MET: Processed through a non-linear projection into the same embedding dimension.
Feature Fusion: The three embeddings are stacked into a tensor of shape $[128, 3, W]$ (embedding dimension, 3 branches, window size).
Classification Head: The fused tensor passes through a 2D Convolutional Neural Network (CNN) stack (3 blocks with 64, 128, and 256 channels) to learn temporal correlations across the window. This is followed by global average pooling and a fully connected layer with a Softmax output for 4 classes (Normal, Barrel Glitch, Endcap Glitch, Forward Glitch).
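
To make the shapes concrete, the sketch below illustrates two pieces of this design in plain NumPy: attention pooling of an unordered set into a fixed-length vector, and stacking the three branch embeddings into a $[128, 3, W]$ tensor. It is a simplified stand-in, not the authors' network. The Transformer encoders and the CNN head are omitted, random matrices replace learned weights, and the single-query pooling form, the `tanh` MET projection, and the assumption that MET is one scalar per event are all hypothetical choices.

```python
import numpy as np

EMBED_DIM = 128  # embedding dimension from the fused shape [128, 3, W]


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_pool(x, query):
    """Collapse a set of embeddings (n_objects, d) into one vector (d,).

    Each element's weight depends only on its own content, so reordering
    the rows of x leaves the pooled vector unchanged.
    """
    scores = x @ query / np.sqrt(x.shape[1])  # (n_objects,)
    return softmax(scores) @ x                # weighted sum over the set


rng = np.random.default_rng(0)
W = 30

# Random matrices stand in for learned linear projections (assumption).
proj_track = rng.normal(size=(7, EMBED_DIM))
proj_tower = rng.normal(size=(8, EMBED_DIM))
proj_met = rng.normal(size=(1, EMBED_DIM))
query = rng.normal(size=EMBED_DIM)


def embed_event(tracks, towers, met):
    """Return the three branch embeddings of one event, shape (128, 3)."""
    t = attention_pool(tracks @ proj_track, query)
    c = attention_pool(towers @ proj_tower, query)
    m = np.tanh(met @ proj_met)               # non-linear MET projection
    return np.stack([t, c, m], axis=1)


# One toy window of W events: 30 tracks x 7, 30 towers x 8, scalar MET.
window = [
    (rng.normal(size=(30, 7)), rng.normal(size=(30, 8)), rng.normal(size=1))
    for _ in range(W)
]
fused = np.stack([embed_event(*event) for event in window], axis=-1)
print(fused.shape)  # (128, 3, 30)

# Shuffling the tracks of an event does not change its embedding.
tracks, towers, met = window[0]
shuffled = tracks[rng.permutation(len(tracks))]
print(np.allclose(embed_event(tracks, towers, met),
                  embed_event(shuffled, towers, met)))  # True
```

Self-attention without positional encodings preserves this property as well, which is why the set-based branches can ignore the arbitrary ordering of tracks and towers within an event.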

C. Training Strategy

Loss Function: The model is trained to predict a probability distribution over the 4 classes for each window. The Kullback-Leibler (KL) divergence is used as the cost function to minimize the difference between the predicted and target probability vectors.
Metrics: Performance is evaluated using Hard Accuracy (majority class prediction) and a modified Brier Score (measuring calibration of the full probability distribution).
Validation: A 5-fold cross-validation strategy is employed with an ensemble approach (combining predictions from 5 independently trained models) to ensure robustness against statistical fluctuations in the random sampling of tracks and towers.
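
The quantities named in this training strategy can be written out compactly. The sketch below is illustrative and rests on assumptions: the paper's "modified" Brier score is not specified here, so the standard multi-class mean squared error between probability vectors is used in its place, ensembling is taken to be a plain average of the fold predictions, and the probability values are made up.

```python
import numpy as np


def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred) for two probability vectors over the 4 classes."""
    target = np.clip(target, eps, 1.0)
    pred = np.clip(pred, eps, 1.0)
    return float(np.sum(target * np.log(target / pred)))


def brier_score(target, pred):
    """Standard multi-class Brier score (a stand-in for the modified version)."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))


def ensemble_mean(fold_preds):
    """Average the per-fold probability vectors (assumed ensembling rule)."""
    return np.mean(np.asarray(fold_preds), axis=0)


def hard_match(target, pred):
    """1 if the majority class of the prediction matches the target's, else 0."""
    return int(np.argmax(target) == np.argmax(pred))


# Made-up example: class order is (Normal, Barrel, Endcap, Forward).
target = np.array([0.8, 0.2, 0.0, 0.0])
fold_preds = np.array([
    [0.75, 0.20, 0.03, 0.02],
    [0.70, 0.25, 0.03, 0.02],
    [0.80, 0.15, 0.03, 0.02],
    [0.78, 0.18, 0.02, 0.02],
    [0.72, 0.22, 0.03, 0.03],
])

ensemble = ensemble_mean(fold_preds)
print("ensemble:", ensemble)
print("KL:", kl_divergence(target, ensemble))
print("Brier:", brier_score(target, ensemble))
print("hard match:", hard_match(target, ensemble))
```

KL divergence penalizes the whole predicted distribution rather than only the top class, which fits a setup where the target for a window is itself a probability vector.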

3. Key Contributions

End-to-End Event-Level DQM: Unlike traditional histogram-based methods, MEDIC operates directly on raw particle-level kinematic inputs, allowing for the detection of subtle inconsistencies invisible in aggregated data.
Simulation-First Paradigm: The framework demonstrates that high-quality DQM models can be developed and validated entirely on simulated data with injected faults. This eliminates the dependency on pre-certified real data, allowing for immediate model adaptation to detector upgrades (crucial for the High-Luminosity LHC).
Anomaly Localization: The network is capable not just of detecting that an anomaly exists, but of classifying the specific source of the malfunction (e.g., distinguishing between a barrel and endcap failure).
Open Framework: The complete pipeline, including modified Delphes configurations and training code, is made publicly available to ensure reproducibility.

4. Results

The model was tested with various window sizes ($W$).

Optimal Window Size: Performance improved as $W$ increased from 10 to 30, after which it saturated. $W=30$ was selected as the optimal balance between performance and computational cost.
Performance Metrics (at $W=30$):
- Multi-Class Accuracy: 89.7%
- Binary Accuracy (Normal vs. Anomaly): 90.3%
- AUC (Area Under Curve): 0.963 (Multi-class) and 0.961 (Binary).
- Brier Score: 0.001, indicating excellent probability calibration.
Robustness: The ensemble approach and cross-validation confirmed the model's stability. Even without explicit binary training, the model effectively separated normal from anomalous runs.
Inference Cost: The architecture scales linearly with window size, making it suitable for real-time or "fast" online monitoring.
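
One plausible way to obtain a binary (Normal vs. Anomaly) decision from a 4-class output, without any binary training, is to treat everything outside the Normal class as anomalous. The sketch below assumes that rule; the summary does not state how the binary figures were actually derived, and the function names, the 0.5 threshold, and the probabilities are all hypothetical.

```python
import numpy as np

CLASSES = ["normal", "barrel", "endcap", "forward"]


def anomaly_score(probs):
    """Probability mass assigned to any glitch class: 1 - P(normal)."""
    return 1.0 - np.asarray(probs)[..., 0]


def multiclass_accuracy(probs, labels):
    return float(np.mean(np.argmax(probs, axis=-1) == np.asarray(labels)))


def binary_accuracy(probs, labels, threshold=0.5):
    predicted_anomalous = anomaly_score(probs) > threshold
    truly_anomalous = np.asarray(labels) != 0
    return float(np.mean(predicted_anomalous == truly_anomalous))


# Made-up softmax outputs for four windows and their true class indices.
probs = np.array([
    [0.90, 0.05, 0.03, 0.02],   # normal, predicted normal
    [0.10, 0.60, 0.20, 0.10],   # barrel, predicted barrel
    [0.20, 0.50, 0.20, 0.10],   # endcap, predicted barrel (wrong region)
    [0.60, 0.20, 0.10, 0.10],   # normal, predicted normal
])
labels = [0, 1, 2, 0]

print(multiclass_accuracy(probs, labels))  # 0.75
print(binary_accuracy(probs, labels))      # 1.0
```

Under this rule, confusing one glitch region with another still counts as a correct anomaly call, which would be consistent with the binary accuracy being slightly higher than the multi-class accuracy.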

5. Significance and Future Outlook

Operational Efficiency: MEDIC offers a pathway to reduce the manual load on human shifters by providing automated, real-time alerts with specific fault localization.
Scalability: The modular architecture allows for the addition of new failure modes without retraining the entire network from scratch.
Future Work: While the current study uses Delphes (particle-level), the authors note that future iterations could incorporate Geant4 simulations for electronic-level signals (digitization, timing, noise) to further improve resolution and accuracy.
Paradigm Shift: This work establishes a new paradigm where simulation-driven, ML-based DQM tools are developed in parallel with detector hardware, ensuring readiness for future collider upgrades and high-luminosity operations.

In conclusion, MEDIC represents a significant step toward fully automated, intelligent data quality monitoring in particle physics, leveraging simulation to overcome the limitations of traditional, human-centric DQM systems.

