Peak-Based Nuclide Identification in HPGe… — Plain-Language Explanation

Original authors: Samuel Emmons, Kelly Truax, Maurice Lonsway, Bruce Pierson, Brian Archambault

Published 2026-06-16

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

The Problem: Finding Needles in a Haystack

Imagine you have a giant box of mixed-up Lego bricks. Some are red, some are blue, some are tiny, and some are huge. Your job is to look at a pile of these bricks and tell a friend exactly which specific colors and shapes are in that pile.

In the real world, scientists use special detectors (called HPGe detectors) to look at radioactive samples. These detectors produce a "spectrum," which is like a complex graph of peaks and valleys. Each "peak" represents a specific type of radioactive atom (a nuclide) emitting energy.

The problem is that these spectra are messy. Peaks overlap, some are very faint, and there are hundreds of different types of atoms to look for. Traditionally, human experts have to sit down, look at the graph, fit the peaks perfectly, and use standard software (like Genie 2000) to guess which atoms are present. This is slow, tiring, and sometimes the software gets it wrong, suggesting atoms that aren't actually there (false alarms).

The Solution: Training a Smart Assistant

The authors of this paper wanted to build a "smart assistant" using Machine Learning (ML) to help the experts. Instead of feeding the computer the whole messy graph, they gave it a "shopping list" of the most important peaks that experts had already identified and measured.

They taught two types of AI "students" to look at this list and decide which atoms are present:

XGBoost: Think of this as a team of detectives who ask a series of "Yes/No" questions to narrow down the suspects.
DNN (Deep Neural Networks): Think of this as a super-brain that looks for complex patterns and connections across the whole list at once.

The Results: Who Won the Contest?

The team tested these AI models against the traditional software (Genie 2000) using about 1,600 real-world examples of radioactive samples.

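The Score: The AI models were much better at the job. The best AI model (XGBoost) got an F1 score of 0.97 out of 1.0, while the traditional software only got 0.84. (The F1 score is a single number that rewards both finding the right atoms and avoiding wrong ones.)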
The Main Victory: The biggest win wasn't just finding the right atoms; it was not finding the wrong ones. The traditional software was like a security guard who yells "Intruder!" every time a shadow moves. The AI models were smarter; they only yelled "Intruder!" when they were actually sure. This means fewer false alarms for the human experts.

The "Why": The Magic Mirror (SHAP)

One of the most important parts of this paper is that the authors didn't just say, "The AI works." They wanted to know how it works. They used a tool called SHAP (which acts like a magic mirror) to see exactly which clues the AI was using to make its decisions.

They found that the AI wasn't just guessing; it was using physics-based logic:

The Main Clue: If the AI thinks "Cadmium-109" is there, it's mostly looking at the specific peak for Cadmium-109.
The Family Clue: The AI also looks at the "family." For example, if it sees a short-lived atom called Niobium-97, it checks to see if its "parent" atom (Zirconium-97) is also there. If the parent is missing, the AI knows the child probably shouldn't be there either. Traditional software often misses this family connection.
The Context Clue: The AI understands that a peak at a certain energy might mean one thing if it's alone, but something else if it's surrounded by other specific peaks.

The Limitations: When the AI Gets Confused

The paper admits the AI isn't perfect.

Rare Items: If a specific atom appears very rarely in the training data (like a rare Lego piece that only shows up 3% of the time), the AI sometimes struggles to identify it correctly.
Bad Labels: If the human experts made mistakes when labeling the training data (saying "This is Atom A" when it was actually "Atom B"), the AI gets confused. It learns from the mistakes it was taught.

The Bottom Line

This paper shows that by teaching computers to look at the "shopping list" of radioactive peaks, we can create a tool that is faster and more accurate than current methods. It doesn't replace the human expert; instead, it acts like a highly skilled assistant that filters out the noise and false alarms, letting the human focus on the real work. The AI learned to think like a physicist, using both the main clues and the surrounding context to make the right call.

Technical Summary: Improving Peak-Based Nuclide Identification in HPGe $\gamma$-Spectrometry with Machine Learning and SHAP

Problem Statement
High-purity germanium (HPGe) gamma spectrometry is essential for analyzing complex radioactive samples, particularly in nuclear forensics. However, identifying the radionuclides in these spectra (nuclide identification, or NID) and then quantifying them is a time-intensive process requiring expert analysts to manually review and refit photopeaks, often using commercial software. While software assists in NID, analysts frequently must manually amend the suggested list of nuclides, especially when dealing with overlapping peaks or complex isotopic combinations. As the volume of samples requiring analysis increases, the reliance on manual intervention creates a bottleneck for timely and accurate decision-making. The challenge lies in automating this process without sacrificing the physical rigor and expert judgment currently embedded in the workflow.

Methodology
The authors developed and evaluated supervised machine learning (ML) models to map carefully fitted photopeak areas from HPGe spectra to NID results for a library of 65 radioisotopes.

Data Source: The study utilized approximately 1,600 well-labeled experimental HPGe gamma spectra collected at Pacific Northwest National Laboratory (PNNL) during nuclear forensics R&D. These spectra contained over 800 unique isotopic combinations. A holdout set of 123 spectra from recent work was used to test generalizability.
Feature Engineering: Photopeaks were initially located and fitted using commercial software (Apex-Gamma) and subsequently reviewed and refitted by expert analysts. To create ML inputs, peak areas were mapped to a fixed vector of 359 entries corresponding to known gamma emission energies of the 65 isotopes.
- A logarithmic rescaling method (Eqs. 1 and 2) was applied to the peak areas. This approach was found to improve model convergence and computational stability compared to standard min-max scaling or efficiency-based scaling, as it compressed the dynamic range of peak areas while preserving relative relationships.
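The feature-engineering step above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the four-line library, the matching tolerance `tol_kev`, and the function names are hypothetical, and because the paper's Eqs. 1 and 2 are not reproduced in this summary, `log_rescale` uses a plain `log1p` followed by division by the maximum as an assumed stand-in.

```python
import numpy as np

# Hypothetical mini-library of gamma-line energies in keV (approximate values
# for Cd-109, Nb-97, Zr-97 and one extra line). The paper's vector has 359 entries.
LIBRARY_KEV = np.array([88.03, 657.94, 743.36, 1332.49])


def peaks_to_vector(peaks, library=LIBRARY_KEV, tol_kev=0.5):
    """Place each fitted peak area in the slot of the nearest library line.

    peaks: iterable of (energy_keV, net_area) pairs.
    Peaks farther than tol_kev from every library line are ignored.
    """
    vec = np.zeros(len(library))
    for energy, area in peaks:
        idx = int(np.argmin(np.abs(library - energy)))
        if abs(library[idx] - energy) <= tol_kev:
            vec[idx] += area
    return vec


def log_rescale(vec):
    """Assumed stand-in for the paper's Eqs. 1 and 2: log1p, then divide by the max."""
    logged = np.log1p(np.clip(vec, 0.0, None))
    top = logged.max()
    return logged / top if top > 0 else logged


# Two peaks match library lines; the 900 keV peak matches nothing and is dropped.
peaks = [(88.1, 1.2e3), (743.3, 4.5e6), (900.0, 50.0)]
raw = peaks_to_vector(peaks)
features = log_rescale(raw)
```

In this toy case the two matched areas differ by more than three orders of magnitude, yet after rescaling the smaller one is still a sizeable fraction of the larger one, which is the kind of dynamic-range compression the summary describes.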
Model Architectures: Three distinct architectures were trained and optimized using Bayesian optimization:
1. XGBoost (XGB): Extreme Gradient Boosted Decision Trees, selected for their aptitude with tabular data.
2. Multi-label Dense Neural Networks (DNN): Designed to learn patterns and correlations across the multi-label scenario.
3. Binary Relevance (BR) DNNs: An ensemble of binary classifiers, one for each target nuclide.
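To make the binary relevance idea concrete, here is a small sketch that trains one independent binary classifier per label column. It is only an illustration under stated assumptions: the paper uses DNNs as the per-nuclide classifiers, whereas this sketch swaps in a tiny logistic regression fitted by gradient descent, and the toy data and function names are hypothetical.

```python
import numpy as np


def _add_bias(X):
    """Append a constant column so each classifier learns an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_logistic(X, y, lr=0.5, steps=2000):
    """Fit one binary logistic-regression classifier by gradient descent."""
    Xb = _add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y)) / len(y)
    return w


def fit_binary_relevance(X, Y):
    """Binary relevance: one independent classifier per label column of Y."""
    return [fit_logistic(X, Y[:, j]) for j in range(Y.shape[1])]


def predict_binary_relevance(models, X, threshold=0.5):
    """Stack the per-label probabilities and threshold each one separately."""
    Xb = _add_bias(X)
    probs = np.column_stack([1.0 / (1.0 + np.exp(-(Xb @ w))) for w in models])
    return (probs >= threshold).astype(int)


# Toy data: 4 spectra, 3 rescaled peak features, 2 nuclide labels.
X = np.array([[0.9, 0.0, 0.0],
              [0.0, 0.8, 0.7],
              [0.8, 0.9, 0.6],
              [0.0, 0.0, 0.0]])
Y = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [0, 0]])

models = fit_binary_relevance(X, Y)
pred = predict_binary_relevance(models, X)
```

Because each label gets its own model, the ensemble cannot share information between nuclides, which is the main design difference from a single multi-label network that predicts all labels at once.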
Evaluation Metrics: Models were assessed using Recall, Precision, and the F1 score (harmonic mean of precision and recall) via 5-fold cross-validation.
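As a rough illustration of these metrics in a multi-label setting, the sketch below pools true positives, false positives, and false negatives over every spectrum and nuclide before computing precision, recall, and F1. The pooling (micro-averaging) is an assumption, since this summary does not say how the paper averaged its scores, and the label matrices are made up.

```python
import numpy as np


def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall and F1 over a binary label matrix."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # nuclide present and predicted
    fp = int(np.sum(~y_true & y_pred))   # predicted but not present (false alarm)
    fn = int(np.sum(y_true & ~y_pred))   # present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Rows are spectra, columns are nuclides (hypothetical labels).
y_true = [[1, 0, 1, 0],
          [0, 1, 0, 0]]
over_eager = [[1, 1, 1, 0],   # one false alarm
              [0, 1, 0, 1]]   # another false alarm

precision, recall, f1 = micro_prf(y_true, over_eager)
# precision = 3/5 = 0.6, recall = 3/3 = 1.0, F1 = 0.75
```

The example shows why false positives matter: the predictor misses nothing (recall of 1.0), yet two false alarms pull precision to 0.6 and F1 to 0.75.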
Explainability: Shapley Additive Explanations (SHAP) were employed to interpret model predictions. Specifically, Deep SHAP was used for DNNs and Tree SHAP for XGB models to quantify the importance of specific photopeaks in driving predictions.
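The idea behind SHAP can be shown with exact Shapley values on a tiny model. The sketch below is not Tree SHAP or Deep SHAP, which are efficient algorithms for trees and neural networks; it is brute-force enumeration over feature coalitions, practical only for a handful of features. The scoring function `toy_score` and its coefficients are hypothetical, chosen to mimic a daughter nuclide whose evidence is strengthened by its parent's peak.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this only suits very small n.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    """Hypothetical score for a daughter nuclide (not a trained model)."""
    daughter_peak, parent_peak, unrelated_peak = z
    # The daughter's own peak matters most; the parent's peak adds support.
    return 0.6 * daughter_peak + 0.3 * daughter_peak * parent_peak


x = [1.0, 1.0, 1.0]          # all three peaks present
baseline = [0.0, 0.0, 0.0]   # no peaks present
phi = shapley_values(toy_score, x, baseline)
# phi is approximately [0.75, 0.15, 0.0]
```

The attributions sum to the difference between the score at `x` and at the baseline (0.9 here). The daughter's own peak receives most of the credit, the parent's peak receives a smaller share from the interaction term, and the unrelated peak receives none.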

Key Results

Performance Superiority: The XGBoost model achieved the highest performance, with an F1 score of 0.967 on the test set, outperforming the multi-label DNN (0.947) and the Binary Relevance DNN (0.952).
Comparison to Commercial Software: When compared against Genie 2000 (a standard commercial spectroscopic software) using an identical nuclide library of 65 isotopes, the ML models significantly outperformed the traditional method.
- On the test set, Genie 2000 achieved an F1 score of 0.839, while XGB achieved 0.967.
- On the independent holdout set of 123 spectra, the best ML model achieved an F1 of 0.92, compared to 0.80 for Genie 2000.
- The primary driver of this improvement was a drastic reduction in false positives; the ML models were less prone to over-estimating the presence of nuclides compared to the template-based methods of the commercial software.
Robustness in Specific Scenarios: The ML models demonstrated superior performance in challenging scenarios, such as identifying isotopes with gamma rays close in energy to others (e.g., Cd-109 vs. Pb X-rays) or short-lived daughter nuclides (e.g., Nb-97) where the parent isotope context is critical.
Explainability Findings: SHAP analysis confirmed that the models rely on physically relevant features:
- The strongest photopeak for any isotope consistently held the highest SHAP value.
- Models successfully utilized spectral context, such as the presence of a parent isotope's peak (e.g., Zr-97) to support the identification of a daughter (e.g., Nb-97).
- XGB models tended to rely on fewer, more strongly weighted features, whereas DNNs distributed importance more evenly across features.

Significance and Claims
The paper claims that supervised ML models, specifically XGBoost, can serve as an effective, expert-informed, automated tool to improve the initial set of radionuclides suggested to an analyst. By mapping fitted photopeaks to NID results, these models can reduce the burden on spectroscopists by providing a smaller, yet highly accurate (nearly perfect recall) list of candidates, thereby driving subsequent quantification more effectively.

A distinct contribution of this work is the application of SHAP to explain the importance of specific photopeaks in ML-based NID. The authors assert this is the first publication to use these tools to demonstrate that ML models rely on physically relevant photopeaks and broader spectral contexts (such as parent-daughter relationships) rather than spurious correlations. This interpretability is crucial for building trust in automated systems within the nuclear forensics community.

The study concludes that while ML models excel at NID, their performance is limited by the support (frequency) of specific nuclides in the training data. The authors suggest that future improvements could involve incorporating high-fidelity synthetic spectra to better generalize for rare isotopes, but they maintain that the current approach already offers a significant advancement over traditional software-based NID methods.

Peak-Based Nuclide Identification in HPGe $\gamma$-Spectrometry with Machine Learning and SHAP