Transforming jet flavour tagging at ATLAS

Imagine the Large Hadron Collider (LHC) as the world's most powerful particle smasher. When it fires protons at each other, they explode into thousands of smaller particles, creating a chaotic storm. Within this storm, physicists are looking for particular "flavors" of particles, namely those that come from heavy quarks (such as bottom and charm quarks), because these are the keys to understanding the Higgs boson and searching for new physics.

The problem is that these heavy particles don't come in neat, labeled boxes. Instead, they turn into "jets"—sprays of smaller particles that look very similar to the sprays created by common, light particles. It's like trying to find a specific type of rare fruit in a giant pile of mixed fruit salad where everything looks like a blur of red and green.

The Old Way: The Two-Step Detective

For years, the ATLAS experiment used a "two-step" detective method to sort these jets.

Step 1: Specialized tools would look at individual clues (like the tracks left by particles) to find specific signs, such as a "secondary vertex" (a spot where a heavy particle decayed a tiny bit away from the main crash site).
Step 2: A computer brain would take all those clues and make a final guess: "Is this a heavy-flavor jet or a light one?"

This worked well, but it was like a detective who first asks a specialist to check the fingerprints, then asks another to check the shoe prints, and finally asks a third person to combine the reports. It was effective, but it relied on humans manually designing the rules for each specialist.

The New Way: GN2, the "Transformer" Detective

This paper introduces GN2, a new algorithm that changes the game. Instead of the two-step process, GN2 is an end-to-end system. Think of it as a single, super-smart detective who looks at the entire crime scene at once, without needing to break it down into separate tasks first.

GN2 uses a technology called a Transformer (the same type of AI architecture that powers modern language models). Here is how it works in simple terms:

Reading the Whole Story: Instead of looking at clues one by one, GN2 looks at the jet and all the particles inside it simultaneously. It understands how the particles relate to each other, much like how you understand a sentence by reading the whole sentence, not just word-by-word.
Physics-Informed Training: To make sure the AI doesn't just memorize the data but actually understands physics, the scientists gave it extra homework. They asked it to do two side tasks:
1. Track Origin: "Where did this specific particle come from?" (Did it come from the main crash, or did it come from a heavy particle decaying?)
2. Vertex Grouping: "Which particles belong to the same group?" (Can you find the cluster of particles that came from the same decay point?)
By forcing the AI to learn these physical concepts, it becomes better at the main job: identifying the jet's flavor. It's like teaching a student not just to pass a test, but to understand the underlying math so they can solve any problem.

The Results: A Massive Leap Forward

The paper compares GN2 to the previous best algorithm (called DL1d). The results are dramatic:

Better at Filtering: At a setting that catches 70% of the heavy "bottom" jets, GN2 is about 3.5 times better at rejecting the fake "charm" jets and about 1.8 times better at rejecting the common "light" jets than the old method, as measured in real collision data.
Real-World Proof: They didn't just test this on computer simulations; they tested it on real data from the LHC. The improvement held up, proving the AI works in the messy, real world.
Versatility: Because GN2 learns the physics directly, it can easily be retrained to spot other things, like "tau" particles (heavier cousins of the electron), without needing to rebuild the whole system from scratch.

Why It Matters

This isn't just a small upgrade; it's a fundamental shift in how particle physics experiments use machine learning. By moving from a "hand-crafted" two-step process to a "learned" end-to-end system, ATLAS has significantly sharpened its tools.

This improvement is crucial for future discoveries. For example, it will help scientists measure how the Higgs boson interacts with charm quarks and search for the production of Higgs boson pairs. The paper suggests these improvements could boost the sensitivity of these future measurements by up to 30%.

In short, GN2 is a smarter, more flexible, and more powerful way to find the "needles" (heavy quarks) in the "haystack" (particle collisions), allowing physicists to see deeper into the secrets of the universe.

Technical Summary: Transforming Jet Flavour Tagging at ATLAS

Problem Statement
Jet flavour tagging is a critical component of the ATLAS physics programme at the Large Hadron Collider (LHC), enabling the identification of jets originating from heavy-flavour quarks ($b$ and $c$), hadronic $\tau$-lepton decays, and light quarks or gluons. Traditional ATLAS flavour-tagging algorithms, such as DL1d, the previous state of the art, rely on a two-stage approach: specialized low-level algorithms extract information from charged-particle tracks (e.g., reconstructing displaced vertices), and a high-level multivariate classifier combines these outputs. While effective, this paradigm relies on manually optimized low-level steps and may not fully exploit the correlations within the low-level tracking data. Furthermore, the increasing complexity of physics analyses, such as measurements of Higgs boson pair production and $c$-quark Yukawa couplings, demands algorithms with higher rejection capabilities for background jets ($c$, light, and $\tau$) while maintaining high signal efficiency.

Methodology
This paper introduces GN2, a novel flavour-tagging algorithm that departs from the traditional two-stage paradigm by employing an end-to-end transformer-based architecture. Unlike previous approaches, which operate on features already extracted by dedicated low-level algorithms, GN2 directly takes track-level information and jet kinematic properties as input.
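As a rough illustration of what such an input can look like in memory, the sketch below appends the jet-level features to every track vector and pads the result to a fixed number of rows with a validity mask. It is only a sketch under stated assumptions: `build_jet_input`, the feature ordering, the zero padding, and the rule of keeping the first tracks are all hypothetical, and only the cap of 40 tracks per jet comes from the GN2 description.

```python
from typing import List, Sequence, Tuple

MAX_TRACKS = 40  # cap on tracks per jet quoted for GN2


def build_jet_input(
    jet_features: Sequence[float],
    tracks: Sequence[Sequence[float]],
    n_track_features: int,
    max_tracks: int = MAX_TRACKS,
) -> Tuple[List[List[float]], List[bool]]:
    """Return (fixed-size list of per-track vectors, validity mask) for one jet.

    Hypothetical layout: [track features..., jet features...] per row.
    """
    for track in tracks:
        if len(track) != n_track_features:
            raise ValueError("every track needs n_track_features values")

    # Assumed truncation rule: keep the first max_tracks tracks as given.
    kept = list(tracks)[:max_tracks]

    # Append the same jet-level features to every real track.
    rows = [list(track) + list(jet_features) for track in kept]
    mask = [True] * len(rows)

    # Pad with zero rows so every jet has the same shape; the mask marks
    # which rows are real so a model can ignore the padding.
    width = n_track_features + len(jet_features)
    while len(rows) < max_tracks:
        rows.append([0.0] * width)
        mask.append(False)
    return rows, mask


# Toy jet with two tracks of three features each and two jet features.
rows, mask = build_jet_input([50.0, 0.3], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], 3)
```

A fixed shape plus a mask is a common way to batch jets with different track multiplicities; an attention layer would then be told to skip the rows where the mask is false.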

Architecture: The core of GN2 is a Transformer encoder. Jet features are concatenated with a fixed-size array of track feature vectors (up to 40 tracks per jet). These combined vectors are processed by a per-track initialisation network, followed by a four-layer Transformer encoder with eight attention heads. This allows the model to learn relationships between tracks within a jet, effectively capturing the complex topology of heavy-flavour decays.
Physics-Informed Auxiliary Objectives: To enhance interpretability and performance, GN2 incorporates two auxiliary training objectives alongside the primary jet classification task:
1. Track Origin Prediction: The network predicts the physical origin of each track (e.g., primary interaction, $b$-hadron decay, $c$-hadron decay, $\tau$ decay, or pile-up).
2. Vertex Grouping: The network determines which pairs of tracks originate from a common vertex, enabling the reconstruction of secondary vertices without explicit vertex-finding algorithms.
  These objectives are embedded in a combined loss function, allowing simultaneous optimization.
Training and Deployment: The model is trained on a mixture of simulated $t\bar{t}$ and $Z'$ events at $\sqrt{s}=13$ TeV and $13.6$ TeV. A 4-fold cross-validation strategy is employed to prevent data leakage. The algorithm is deployed using OnnxRuntime within the ATLAS software framework.
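The combined loss mentioned above can be pictured as a weighted sum of three terms: one for the jet label, one for the per-track origin labels, and one for the per-pair vertex labels. The sketch below assumes cross-entropy for the two classification tasks and binary cross-entropy for the track-pair task; the function names and the weights `w_origin` and `w_vertex` are hypothetical placeholders, not values taken from the paper.

```python
import math
from typing import Iterable, Sequence

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Negative log-probability assigned to the true class."""
    return -math.log(max(probs[label], EPS))


def binary_cross_entropy(p: float, label: int) -> float:
    """Binary cross-entropy for one probability p of the positive class."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def mean(values: Iterable[float]) -> float:
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def combined_loss(
    jet_probs: Sequence[float],               # predicted jet-flavour probabilities
    jet_label: int,                           # true jet flavour index
    track_probs: Sequence[Sequence[float]],   # per-track origin probabilities
    track_labels: Sequence[int],              # true origin index per track
    pair_probs: Sequence[float],              # per-pair "same vertex" probability
    pair_labels: Sequence[int],               # 1 if the pair shares a vertex
    w_origin: float = 1.0,                    # hypothetical weight
    w_vertex: float = 1.0,                    # hypothetical weight
) -> float:
    """Jet loss plus weighted auxiliary losses, averaged over tracks and pairs."""
    l_jet = cross_entropy(jet_probs, jet_label)
    l_origin = mean(cross_entropy(p, y) for p, y in zip(track_probs, track_labels))
    l_vertex = mean(binary_cross_entropy(p, y) for p, y in zip(pair_probs, pair_labels))
    return l_jet + w_origin * l_origin + w_vertex * l_vertex


# Toy jet: one track, one track pair, every prediction at 50%.
loss = combined_loss([0.5, 0.25, 0.25, 0.0], 0, [[0.5, 0.5]], [1], [0.5], [1])
jet_and_origin_only = combined_loss(
    [0.5, 0.25, 0.25, 0.0], 0, [[0.5, 0.5]], [1], [0.5], [1], w_vertex=0.0
)
```

Because all three terms share the same network, gradients from the auxiliary terms shape the track representations that the jet classifier also uses, which is the sense in which the objectives are optimized simultaneously.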

Key Contributions

End-to-End Learning: GN2 represents a shift from feature-engineered, two-stage algorithms to a unified, end-to-end deep learning model that processes low-level track data directly.
Transformer Application: It is the first deployment of a Transformer model for jet flavour tagging at ATLAS, replacing the Graph Neural Network (GNN) used in the demonstrator GN1.
Interpretability via Auxiliary Tasks: By explicitly training the network to reconstruct vertex structures and track origins, the authors demonstrate that physics-informed constraints improve the main classification task and provide a mechanism to interpret the model's internal representations.
Unified $\tau$-tagging: Unlike DL1d, GN2 includes a dedicated output node for hadronic $\tau$-lepton decays, allowing for simultaneous $b$, $c$, $\tau$, and light-jet classification.
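With four output probabilities per jet, a single number is still needed to decide whether a jet counts as $b$-tagged. ATLAS taggers conventionally use a log-likelihood-ratio style discriminant for this; the sketch below assumes GN2 follows the same pattern with an extra $\tau$ term. The function name and the fractions `f_c` and `f_tau` are hypothetical placeholders chosen for illustration, not the tuned values.

```python
import math


def b_tag_discriminant(
    p_b: float,
    p_c: float,
    p_u: float,
    p_tau: float,
    f_c: float = 0.2,     # hypothetical weight of the c-jet background
    f_tau: float = 0.05,  # hypothetical weight of the tau-jet background
) -> float:
    """Log ratio of the b probability to a weighted mix of background probabilities."""
    background = f_c * p_c + f_tau * p_tau + (1.0 - f_c - f_tau) * p_u
    return math.log(p_b / background)


# A jet the network cannot classify scores about zero;
# a confidently b-like jet scores well above zero.
d_flat = b_tag_discriminant(0.25, 0.25, 0.25, 0.25)
d_b_like = b_tag_discriminant(0.90, 0.05, 0.04, 0.01)
```

Raising `f_c` makes the discriminant penalize charm-like jets more heavily, trading light-jet rejection for $c$-jet rejection without retraining the network.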

Results
The performance of GN2 is validated in both Monte Carlo simulation and collision data from Run 2 ($\sqrt{s}=13$ TeV) and Run 3 ($\sqrt{s}=13.6$ TeV).
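The results in this section are quoted as background rejection at a fixed $b$-jet efficiency. Rejection is the inverse of the mis-tag rate: the number of background jets divided by the number that pass the cut giving the chosen signal efficiency. The sketch below works through that arithmetic on invented toy scores; the function names and all numbers are illustrative only.

```python
import math
from typing import Sequence


def cut_for_efficiency(signal_scores: Sequence[float], target_eff: float) -> float:
    """Score cut such that roughly target_eff of the signal passes (score >= cut)."""
    ordered = sorted(signal_scores, reverse=True)
    n_pass = max(1, round(target_eff * len(ordered)))
    return ordered[n_pass - 1]


def rejection(background_scores: Sequence[float], cut: float) -> float:
    """Inverse mis-tag rate: background jets in total / background jets passing."""
    n_pass = sum(1 for s in background_scores if s >= cut)
    return math.inf if n_pass == 0 else len(background_scores) / n_pass


# Toy discriminant scores, invented for illustration only.
b_scores = list(range(10))                      # ten true b-jets
light_scores = [0, 1, 1, 2, 2, 2, 4, 0, 1, 5]   # ten true light jets

cut = cut_for_efficiency(b_scores, 0.70)        # working point: keep 70% of b-jets
light_rejection = rejection(light_scores, cut)  # 10 light jets, 2 pass the cut
```

An improvement factor between two taggers is then the ratio of their rejections, each evaluated at its own cut for the same signal efficiency.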

Simulation Performance: In $t\bar{t}$ events, for a standard 70% $b$-jet tagging efficiency, GN2 improves the rejection of $c$-jets by a factor of 3 and light-jets by a factor of 1.6 compared to DL1d. In high-$p_T$ $Z'$ events, $c$-jet rejection again improves by a factor of 3, while the light-jet improvement is larger, at a factor of 4. The inclusion of the $\tau$ output node improves $\tau$-jet rejection by a factor of up to 8–9.
Data Performance: Using 140 fb$^{-1}$ of Run 2 collision data, the calibrated performance confirms the simulation results. For a 70% $b$-jet tagging efficiency, the measured $c$-jet rejection in data improves by a factor of 3.5, and light-jet rejection by a factor of 1.8, relative to DL1d.
Robustness: The algorithm shows minimal dependence on the choice of Monte Carlo event generator (e.g., Powheg Box, Herwig, Sherpa), with performance ratios between alternative generators and the nominal setup remaining within 1–2% for $b$-jets and within 10% for $c$-jets.
Auxiliary Task Performance: The track origin classification achieves 84% efficiency and purity for heavy-flavour tracks. The vertex finding capability successfully reconstructs secondary vertices with a transverse displacement distribution and mass consistent with truth-level references, despite not being explicitly trained on vertex mass.
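One way to turn pairwise "same vertex" predictions into vertex candidates is to link every pair the network considers compatible and read off the connected groups. The sketch below does this with a union-find structure. This is an assumed post-processing step, not a procedure confirmed by the summary above, and the function name, threshold, and toy probabilities are hypothetical.

```python
from typing import Dict, List, Tuple


def group_tracks(
    n_tracks: int,
    pair_probs: Dict[Tuple[int, int], float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[List[int]]:
    """Merge tracks linked by confident pair predictions into vertex candidates."""
    parent = list(range(n_tracks))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), prob in pair_probs.items():
        if prob >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                # Attach the larger root index to the smaller one.
                parent[max(root_i, root_j)] = min(root_i, root_j)

    groups: Dict[int, List[int]] = {}
    for track in range(n_tracks):
        groups.setdefault(find(track), []).append(track)

    # A vertex needs at least two tracks; unpaired tracks are dropped.
    return [members for members in groups.values() if len(members) >= 2]


# Toy jet with five tracks: 0, 1 and 2 are linked, 3 and 4 are not.
toy_pairs = {(0, 1): 0.9, (1, 2): 0.8, (3, 4): 0.2, (0, 3): 0.1}
vertices = group_tracks(5, toy_pairs)
```

Quantities such as the displacement or mass of a candidate would then be computed from the tracks in each group, which is how a mass distribution can emerge even though mass never enters the training targets.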

Significance
The paper claims that GN2 provides substantial benefits for physics analyses involving heavy-flavour jets. Specifically, the improved rejection capabilities are projected to enhance the sensitivity of flagship analyses, such as the search for Higgs pair production and the measurement of the $c$-quark Yukawa coupling, by up to 30% at the High-Luminosity LHC. The work demonstrates the successful integration of advanced machine learning methods (Transformers) and physics-informed auxiliary objectives into experimental particle physics, offering a flexible framework that can be rapidly re-tuned for alternative experimental conditions or physics goals. The authors emphasize that the auxiliary objectives not only boost performance but also open new avenues for interpretability and future applications in jet substructure and reconstruction.
