E-PCN: Jet Tagging with Explainable Particle Chebyshev… — Plain-Language Explanation

Original authors: Md Raqibul Islam, Adrita Khan, Mir Sazzat Hossain, Choudhury Ben Yamin Siddiqui, Md. Zakir Hossan, Tanjib Khan, M. Arshad Momen, Amin Ahsan Ali, AKM Mahbubur Rahman

Published 2026-05-05



CC BY 4.0


Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine a high-energy particle collider, like the Large Hadron Collider (LHC), as a massive, high-speed car crash. When two protons smash together, they don't just break into two pieces; they shatter into a chaotic spray of hundreds of smaller particles. Physicists call these sprays "jets."

The challenge is that these jets are the "fingerprint" of the original particle that caused the crash. Did the crash come from a Higgs boson? A top quark? Or just a boring, common particle? Identifying the source is like trying to figure out what kind of car crashed just by looking at the scattered debris.

For years, scientists have used Artificial Intelligence (AI) to sort this debris. But there's a problem: the best AI models are often "black boxes." They get the answer right, but they can't explain why. It's like a student who gets a perfect score on a math test but refuses to show their work. In science, knowing why is just as important as getting the right answer.

This paper introduces a new AI model called E-PCN (Explainable Particle Chebyshev Network). Think of it as a detective that not only solves the case but also writes a detailed report explaining exactly which clues led to the conclusion.

The Problem with Old AI

Previous AI models treated the particle spray like a giant, messy pile of data. They looked at the whole picture at once. While they were good at guessing the particle type, there was no easy way to check whether they relied on accidental patterns or "glitches" in the computer simulation rather than the actual laws of physics. It was like a detective who might be guessing the culprit based on the color of their shoes rather than the fingerprint.

The New Solution: E-PCN

The authors built E-PCN with a specific philosophy: Let's teach the AI the rules of physics first.

Instead of just dumping all the data into a black box, they broke the particle spray down into four specific "lenses" or "views," based on how particles actually behave in the universe (a concept called the Lund Jet Plane). Imagine looking at a crime scene through four different colored glasses:

The Distance Glass (Angular Separation, $\Delta$ ): How far apart are the particles?
The Speed Glass (Relative Transverse Momentum, $k_T$ ): How fast are they moving sideways?
The Share Glass (Momentum Fraction, $z$ ): How much of the original energy did each piece take?
The Weight Glass (Invariant Mass Squared, $m^2$ ): How heavy is the combined group of particles?

The E-PCN model has four parallel "brains" (neural networks). Each brain looks at the jet through only one of these four glasses.

Brain #1 only cares about distance.
Brain #2 only cares about speed.
Brain #3 only cares about energy sharing.
Brain #4 only cares about mass.

After each brain makes its own observation, they all meet at a "conference table" (a classification layer) to combine their notes and decide what the particle was.

The "Aha!" Moment: Explainability

Because the model is built this way, the researchers can ask: "Which brain was the most important for this decision?"

They used a technique called Grad-CAM (think of it as a heat map that highlights the most important clues). The results were fascinating and matched what physicists have known for decades:

Distance and Speed were the stars of the show. Together, they made up about 76% of the decision-making power.
Energy Sharing and Mass made up the remaining 24%.

This suggests the AI isn't just memorizing random patterns; it appears to have picked up the actual "grammar" of the universe. It treats the way particles spread out (distance) and move (speed) as the most critical clues, in line with what Quantum Chromodynamics (QCD), the theory of the strong force, predicts.

Does it work better?

Yes. When tested on a massive dataset of simulated particle collisions (JetClass):

It was more accurate than the baseline PCN model, reaching a macro-accuracy of 94.67% (a relative improvement of 2.36%).
It was much better at spotting rare, heavy particles (like the Higgs boson decaying into bottom quarks): the precision-recall score (AUPR) for that channel improved by 81.53% relative to the PCN baseline.

The Real-World Test: The "Real Data" Challenge

Simulations are clean and idealized, but real life is messy. Real detectors have noise, and particles get lost. To test if E-PCN was truly "smart" or just "good at simulations," the researchers tested it on real data from the CMS experiment at the LHC (called the Aspen Open Jets dataset).

Since they didn't have the "answer key" for the real data, they checked how well the AI could group similar jets together (clustering).

The old model (PCN) produced a messy, jumbled pile of groups.
The new model (E-PCN) produced neat, distinct, well-separated groups.

This suggests that E-PCN learned the true physics of how particles behave, allowing it to work even when the data is noisy and imperfect, just like a real detective working a messy crime scene.

Summary

In short, the authors built a smarter AI for particle physics by giving it a "physics-first" architecture. Instead of letting the AI guess blindly, they gave it four specific tools to measure the universe. The result is a model that is not only more accurate but also open about how it reaches its decisions, and those decisions appear to rest on the fundamental laws of nature rather than computer glitches.

Technical Summary: E-PCN: Jet Tagging with Explainable Particle Chebyshev Networks Using Kinematic Features

Problem Statement
High-energy collider experiments, particularly with the upcoming High-Luminosity Large Hadron Collider (HL-LHC), face significant challenges in processing vast data volumes to identify and classify jets (collimated sprays of particles). While Graph Neural Networks (GNNs) like the Particle Chebyshev Network (PCN) have improved jet classification performance by treating jets as graphs, they often function as "black boxes." This lack of interpretability hinders the validation of model behavior against physical principles, raising concerns that models may learn spurious correlations or detector artifacts rather than genuine Quantum Chromodynamics (QCD) phenomena. There is a critical need for architectures that not only achieve state-of-the-art accuracy but also provide transparent, physically motivated decision-making processes.

Methodology
The authors propose the Explainable Particle Chebyshev Network (E-PCN), an extension of the PCN that explicitly integrates kinematic variables derived from the Lund jet plane formalism into the graph structure.

Multi-Graph Architecture: Instead of concatenating kinematic features into node attributes, E-PCN constructs four parallel graph representations for each jet. Each graph shares the same node features (16-dimensional particle properties) and connectivity (k-nearest neighbors based on angular separation) but utilizes a distinct kinematic variable as the edge weight:
1. Angular separation ( $\Delta$ ): Encodes angular ordering and collinear emissions.
2. Relative transverse momentum ( $k_T$ ): Sets the scale for the strong coupling constant and separates perturbative from non-perturbative regimes.
3. Momentum fraction ( $z$ ): Quantifies energy sharing between daughter partons via DGLAP splitting functions.
4. Invariant mass squared ( $m^2$ ): Provides sensitivity to heavy-flavor jet identification.
  The first three variables are motivated by the Lund plane factorization of QCD emission probabilities; the fourth complements them for heavy-flavor sensitivity.
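
The four edge weights can be illustrated for a single pair of particles. The sketch below uses definitions that are common in the jet-tagging literature ($\Delta = \sqrt{\Delta y^2 + \Delta\phi^2}$, $k_T = \min(p_T^a, p_T^b)\,\Delta$, $z = \min(p_T^a, p_T^b)/(p_T^a + p_T^b)$, $m^2 = (E_a+E_b)^2 - |\vec p_a + \vec p_b|^2$); this summary does not spell out the paper's exact formulas, so these definitions, the dictionary layout, and the names `four_vector` and `pair_edge_weights` are assumptions made for illustration.

```python
import math


def four_vector(pt, y, phi, mass=0.0):
    """Convert (pT, rapidity, phi, mass) to (E, px, py, pz)."""
    mt = math.sqrt(pt * pt + mass * mass)  # transverse mass
    return (mt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), mt * math.sinh(y))


def pair_edge_weights(a, b):
    """Return the four pairwise kinematic variables for particles a and b.

    a, b: dicts with keys "pt", "y", "phi" and optional "m" (hypothetical layout).
    """
    # Wrap the azimuthal difference into [-pi, pi).
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    delta = math.hypot(a["y"] - b["y"], dphi)
    pt_min = min(a["pt"], b["pt"])

    pa = four_vector(a["pt"], a["y"], a["phi"], a.get("m", 0.0))
    pb = four_vector(b["pt"], b["y"], b["phi"], b.get("m", 0.0))
    e, px, py, pz = (u + v for u, v in zip(pa, pb))

    return {
        "delta": delta,                          # angular separation
        "kt": pt_min * delta,                    # relative transverse momentum
        "z": pt_min / (a["pt"] + b["pt"]),       # momentum fraction
        "m2": e * e - px * px - py * py - pz * pz,  # invariant mass squared
    }


# Two massless toy particles at the same rapidity, 0.4 rad apart in phi.
p1 = {"pt": 10.0, "y": 0.0, "phi": 0.0}
p2 = {"pt": 30.0, "y": 0.0, "phi": 0.4}
weights = pair_edge_weights(p1, p2)
print(weights)
```

In a multi-graph setup of the kind described here, each of the four values would serve as the edge weight in its own copy of the same k-nearest-neighbor graph.
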
Network Architecture: Each of the four graph branches is processed by an identical, independently parameterized feature extractor. This extractor employs a hybrid convolutional approach, alternating between Chebyshev Graph Convolutions (ChebConv) to capture local geometric structures and Edge Convolutions (EdgeConv) to model pairwise particle relationships. The resulting four 64-dimensional jet embeddings are stacked and combined via a $1\times1$ convolutional layer before passing through fully connected layers for classification.
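
To make the role of the edge weights in a Chebyshev graph convolution more concrete, here is a minimal sketch of the standard Chebyshev filter $y = \sum_k \theta_k T_k(\tilde{L})\,x$ on a small weighted graph, using the recursion $T_k = 2\tilde{L}T_{k-1} - T_{k-2}$. It assumes the usual approximation $\lambda_{\max} \approx 2$, so that $\tilde{L} = L_{\text{sym}} - I = -D^{-1/2} W D^{-1/2}$. How E-PCN normalizes its weights or chooses the polynomial order is not stated in this summary; the function names and the toy graph are hypothetical.

```python
import math


def scaled_laplacian(weights):
    """Return L~ = -D^{-1/2} W D^{-1/2} for a symmetric weight matrix W.

    Assumes lambda_max ~ 2 for the normalized Laplacian (a common shortcut).
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[-inv_sqrt[i] * weights[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def cheb_filter(weights, signal, thetas):
    """Apply y = sum_k thetas[k] * T_k(L~) x to one scalar node signal."""
    lap = scaled_laplacian(weights)
    t_prev = list(signal)                       # T_0 x = x
    out = [thetas[0] * v for v in t_prev]
    if len(thetas) == 1:
        return out
    t_curr = matvec(lap, signal)                # T_1 x = L~ x
    out = [o + thetas[1] * v for o, v in zip(out, t_curr)]
    for theta in thetas[2:]:
        # Chebyshev recursion: T_k x = 2 L~ T_{k-1} x - T_{k-2} x
        t_next = [2.0 * a - b for a, b in zip(matvec(lap, t_curr), t_prev)]
        out = [o + theta * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Two particles joined by one edge of weight 1; the signal sits on node 0.
W = [[0.0, 1.0], [1.0, 0.0]]
filtered = cheb_filter(W, [1.0, 0.0], [1.0, 1.0, 1.0])
print(filtered)
```

Swapping the weight matrix `W` for one built from $\Delta$, $k_T$, $z$, or $m^2$ changes the Laplacian, and therefore what the same polynomial filter responds to, which is the intuition behind giving each kinematic variable its own graph.
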
Explainability Mechanism: The authors adapt Gradient-weighted Class Activation Mapping (Grad-CAM) to this multi-graph setting. By computing the gradient of the class score with respect to the embeddings of each specific graph branch, they quantify the relative importance of each kinematic variable in the classification decision.
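
As a rough illustration of how a gradient-based attribution can rank the four branches, the sketch below uses a deliberately simplified fusion head: the branch embeddings are mixed by one scalar weight per branch (a toy stand-in for the $1\times1$ convolution with a single output channel) and scored by a linear layer. Because this head is linear, the gradient of the class score with respect to each embedding is available in closed form. The gradient-times-activation score, the clipping at zero (in the spirit of Grad-CAM's ReLU), and the names `branch_shares`, `mix_weights`, and `class_weights` are assumptions; the paper's exact Grad-CAM adaptation is not reproduced here.

```python
def branch_shares(embeddings, mix_weights, class_weights):
    """Normalized importance of each branch for one class score.

    Toy model: score = sum_d class_weights[d] * sum_b mix_weights[b] * embeddings[b][d]
    so d(score)/d(embeddings[b][d]) = mix_weights[b] * class_weights[d].
    """
    raw = []
    for embedding, w_branch in zip(embeddings, mix_weights):
        grads = [w_branch * c for c in class_weights]
        # Gradient-times-activation, summed over the embedding dimensions.
        contribution = sum(g * e for g, e in zip(grads, embedding))
        raw.append(max(contribution, 0.0))  # keep only positive evidence
    total = sum(raw)
    if total == 0.0:
        return [0.0 for _ in raw]
    return [r / total for r in raw]


# Four toy 2-dimensional embeddings, ordered as (Delta, kT, z, m^2).
toy_embeddings = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
shares = branch_shares(toy_embeddings,
                       mix_weights=[2.0, 1.0, 1.0, 1.0],
                       class_weights=[1.0, 1.0])
for name, share in zip(["delta", "kt", "z", "m2"], shares):
    print(f"{name}: {100.0 * share:.1f}%")
```

With a real network the gradients would come from automatic differentiation rather than a closed form, and the shares would be averaged over many jets to obtain per-class or global importance figures.
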

Key Contributions

Physics-Informed Multi-Graph Design: E-PCN introduces a novel architecture that processes complementary aspects of QCD jet dynamics (geometric structure, radiative scales, splitting probabilities, and mass thresholds) simultaneously through dedicated graph channels, rather than treating them as a monolithic feature set.
Quantitative Explainability: The work demonstrates how Grad-CAM can be applied to multi-graph GNNs to reveal a physically interpretable hierarchy of feature importance. The analysis confirms that the network prioritizes variables consistent with perturbative QCD factorization.
Generalization to Real Data: Unlike many benchmarks restricted to simulation, the authors evaluate the model's representation quality on the Aspen Open Jets dataset, comprising real CMS collision data with detector effects and pileup. They employ unsupervised DeepCluster training to assess clustering structure in the absence of ground-truth labels.

Results
Evaluated on the JetClass benchmark (9 signal classes and 1 background):

Classification Performance: E-PCN achieves a macro-accuracy of 94.67%, a macro-AUC of 96.78%, and a macro-AUPR of 82.41%. These represent relative improvements of 2.36%, 4.13%, and 24.88% over the baseline PCN, respectively. Notably, the AUPR for heavy-flavor channels ( $H \to b\bar{b}$ ) improved by 81.53%.
Explainability Analysis: Grad-CAM reveals that angular separation ( $\Delta$ ) and relative transverse momentum ( $k_T$ ) collectively account for approximately 76% of classification decisions (40.72% and 35.67%, respectively). This hierarchy aligns with the soft-collinear factorization structure of QCD. Class-specific variations were observed, such as elevated $k_T$ importance for gluon jets and increased $m^2$ importance for bottom-quark jets, consistent with Casimir scaling and the dead-cone effect.
Real Data Generalization: On the Aspen Open Jets dataset, E-PCN produced significantly more structured latent representations than PCN. The Davies-Bouldin Index decreased by 52.15% (0.8395 $\to$ 0.4017), and the Dunn Index increased by 42.33% (0.0189 $\to$ 0.0269), indicating superior cluster compactness and separation.
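
For readers unfamiliar with the two clustering metrics, the following sketch computes them on toy 2D clusters and re-derives the relative changes quoted above from the reported index values. It uses the textbook Davies-Bouldin definition (mean over clusters of the worst scatter-to-separation ratio) and one common Dunn variant (smallest between-cluster point distance divided by largest within-cluster diameter); which variant the authors used is not stated in this summary, and the toy clusters are invented for illustration.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def davies_bouldin(clusters):
    """Lower is better: compact clusters that sit far apart."""
    centers = [centroid(c) for c in clusters]
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


def dunn(clusters):
    """Higher is better: min between-cluster gap over max cluster diameter."""
    gap = min(math.dist(p, q)
              for a, b in combinations(clusters, 2) for p in a for q in b)
    diameter = max(math.dist(p, q)
                   for c in clusters for p, q in combinations(c, 2))
    return gap / diameter


def relative_change(old, new):
    """Signed relative change in percent."""
    return 100.0 * (new - old) / old


tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
loose = [[(0.0, 0.0), (0.0, 4.0)], [(5.0, 0.0), (5.0, 4.0)]]
print(davies_bouldin(tight), davies_bouldin(loose))
print(dunn(tight), dunn(loose))

# Relative changes implied by the index values reported for PCN -> E-PCN.
print(round(relative_change(0.8395, 0.4017), 2))
print(round(relative_change(0.0189, 0.0269), 2))
```

The tight toy clusters score a lower Davies-Bouldin Index and a higher Dunn Index than the loose ones, which is the same direction of change reported for E-PCN relative to PCN.
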

Significance and Claims
The paper claims that E-PCN bridges the gap between high-performance deep learning and physical interpretability in jet tagging. By building Lund plane kinematic variables directly into the graph structure, the model is designed to learn representations that reflect the underlying QCD radiation patterns rather than simulation artifacts. The authors emphasize that the agreement between the feature-importance hierarchy and theoretical QCD predictions serves as validation that the architecture effectively exploits the structure present in the training data.

Crucially, the improved clustering performance on real CMS data suggests that these physics-informed representations are robust enough to generalize beyond idealized simulations to experimental conditions involving detector effects and pileup. The work concludes that building neural networks around established kinematic principles enhances both interpretability and classification performance, offering a promising direction for jet tagging in future high-luminosity collider environments. The authors note that definitive validation of these interpretability claims under full experimental systematic uncertainties remains a subject for future work.
