Photometric Redshift PDFs via Neural Network… — Plain-Language Explanation

The Big Picture: Guessing How Far Away Galaxies Are

Imagine you are looking at a crowd of people from a great distance. You can see their clothes and how bright they are, but you can't see their faces clearly. You want to know how far away each person is.

In astronomy, this is the problem of photometric redshift. Astronomers take pictures of billions of galaxies using different colored filters (like taking photos through red, blue, and green glasses). They want to know how far away each galaxy is based only on these colors and brightness levels.

The problem is that a galaxy can look "red" because it is very far away (its light has stretched), or because it is actually close but just happens to be a red, dusty galaxy. This is called a "degeneracy"—two different things looking the same.

The New Tool: A "Smart Sorter" Instead of a "Calculator"

Traditionally, computers tried to guess the exact distance of a galaxy, like a calculator giving you a single number (e.g., "500 million light-years"). But if the computer is wrong, it doesn't tell you how wrong it might be.

The authors of this paper built a new method called Neural Network Classification (NNC). Instead of acting like a calculator, their computer acts like a smart sorter.

The Bins: Imagine a long shelf with 400 small boxes lined up, representing different distances (redshifts).
The Job: Instead of picking one box, the computer looks at a galaxy and says, "I think there is a 60% chance it belongs in Box 100, a 30% chance in Box 101, and a 10% chance in Box 99."
The Result: This gives a Probability Density Function (PDF). It's like a weather forecast that says, "There's a 60% chance of rain, 30% chance of clouds, 10% chance of sun," rather than just saying "It will rain." This tells astronomers not just the best guess, but how confident they should be.
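
For readers who like to see the idea in code, here is a minimal sketch of the box example above. It assumes the 400 boxes are equally wide and cover redshift 0 to 2, and that "Box 100" means index 100 counting from zero; both are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np

# Assumed layout: 400 equal-width boxes covering redshift 0 to 2.
N_BINS = 400
Z_MIN, Z_MAX = 0.0, 2.0
edges = np.linspace(Z_MIN, Z_MAX, N_BINS + 1)
centers = 0.5 * (edges[:-1] + edges[1:])
bin_width = edges[1] - edges[0]

# The example from the text: 10% in box 99, 60% in box 100, 30% in box 101.
probs = np.zeros(N_BINS)
probs[99], probs[100], probs[101] = 0.10, 0.60, 0.30

# Dividing by the box width turns box probabilities into a density (the PDF).
pdf = probs / bin_width

# Two common single-number summaries of the same PDF.
z_mode = centers[np.argmax(probs)]   # centre of the most likely box
z_mean = np.sum(probs * centers)     # probability-weighted average

print(f"mode = {z_mode:.4f}, mean = {z_mean:.4f}")
```

The mode and the mean differ slightly because the probabilities are lopsided, which is exactly the kind of information a single-number answer throws away.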

The Secret Sauce: A Better Training Class

To teach this computer, you need a "training class" of galaxies where we already know the distance precisely (measured with powerful spectrographs).

The Old Class: Before this paper, the training class was mostly made of galaxies from the SDSS survey. It was like a class full of elementary school students. It was great for teaching about nearby things, but it had very few "high schoolers" (distant galaxies).
The New Class: The authors used data from DESI DR1, a massive new survey. This added millions of new "high schoolers" to the training class.
The Result: Because the computer was trained on a much wider variety of galaxies (including very distant ones), it became much better at guessing distances for the whole universe, especially for things far away.

The Two Surveys: Deep vs. Wide

The team tested their method on two different "cameras":

LSDR10 (The Deep Camera): This camera takes very sharp, deep pictures of a specific area. It sees faint, distant objects clearly.
- Result: The computer was incredibly accurate here. It was like using a high-end microscope.
Pan-STARRS (The Wide Camera): This camera sees a much larger area of the sky, but the pictures are a bit shallower (less detailed).
- The Fix: To help the computer with the Wide Camera, the authors added infrared data (heat signatures) from the unWISE survey.
- The Analogy: Imagine trying to identify a fruit by color alone. A red apple and a red tomato look the same. But if you can also feel the temperature (infrared), you can tell them apart. Adding this "heat" data helped the computer distinguish between different types of galaxies much better, reducing errors by about 22%.
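
The "about 22%" figure can be checked from the scatter values quoted in the technical summary further down (0.0283 with optical bands only, 0.0222 with infrared added). The sketch below also shows how such a scatter number is commonly computed. The definitions used here (normalized residuals, the 1.4826 scaling, and a 0.15 outlier threshold) are widely used conventions and are assumptions on our part; the paper may define its metrics differently.

```python
import numpy as np

def normalized_residuals(z_phot, z_spec):
    """Redshift error scaled by (1 + z_spec), a common photo-z convention."""
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    return (z_phot - z_spec) / (1.0 + z_spec)

def sigma_nmad(z_phot, z_spec):
    """Normalized median absolute deviation: a scatter measure robust to outliers."""
    dz = normalized_residuals(z_phot, z_spec)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))

def outlier_fraction(z_phot, z_spec, threshold=0.15):
    """Fraction of sources whose normalized error exceeds an assumed threshold."""
    dz = normalized_residuals(z_phot, z_spec)
    return np.mean(np.abs(dz) > threshold)

# Relative improvement implied by the two quoted scatter values.
optical_only = 0.0283
with_infrared = 0.0222
relative_reduction = (optical_only - with_infrared) / optical_only
print(f"{relative_reduction:.1%}")  # prints 21.6%, i.e. roughly 22%
```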

Why This Matters

The paper shows that this new "Smart Sorter" method is better than older methods (like Random Forests or standard neural networks) for two main reasons:

It handles confusion: When a galaxy looks like two different things at once (a common problem), the computer doesn't just guess one wrong answer. It shows a "double peak" in its probability, telling the astronomer, "It could be here OR there, I'm not sure."
It knows its limits: The computer is very good at telling you when it is confident and when it is guessing.

The Final Product: A Unified Map

The authors didn't just write a paper; they built a massive catalog. They combined the data from both cameras into one giant map of over 550 million galaxies.

They used a "hierarchical strategy" (a priority list):

If a galaxy is in the "Deep Camera" area, they use the most detailed model.
If it's only in the "Wide Camera" area, they use the model with the infrared help.
If it's in both, they pick the best one.
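
A priority list like this is straightforward to express in code. The sketch below is hypothetical: the field names, the model labels, and the exact ordering (deep-survey model first, then the infrared-assisted model, then an optical-only fallback) are assumptions made for illustration, not the authors' published selection logic.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Source:
    """Hypothetical flags describing which photometry exists for one galaxy."""
    in_lsdr10: bool
    in_ps1: bool
    has_unwise: bool

def select_model(src: Source) -> Optional[str]:
    """Return the label of the first model in the assumed priority order."""
    if src.in_lsdr10:
        return "lsdr10"        # deep-survey model, assumed to win overlaps
    if src.in_ps1 and src.has_unwise:
        return "ps1_unwise"    # wide-survey model with infrared help
    if src.in_ps1:
        return "ps1_optical"   # assumed fallback when no infrared data exist
    return None                # no usable photometry

print(select_model(Source(in_lsdr10=True, in_ps1=True, has_unwise=True)))
print(select_model(Source(in_lsdr10=False, in_ps1=True, has_unwise=False)))
```

Checking the conditions from most to least preferred means every galaxy gets exactly one redshift estimate, so the combined catalog has no duplicates in overlap regions.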

Summary

The authors created a new AI tool that sorts galaxies into distance "bins" instead of guessing a single number. By training it on a massive new dataset of known galaxies (DESI) and adding infrared "heat" data, they produced distance estimates for these surveys that are more accurate than those from the older methods they compared against. The resulting catalog is now available for other scientists to use in studying how the universe is expanding and evolving.

Technical Summary: Photometric Redshift PDFs via Neural Network Classification for DESI Legacy Imaging Surveys and Pan-STARRS

Problem Statement
Accurate photometric redshift (photo-z) estimation is critical for modern extragalactic astronomy and cosmology, particularly for large-scale structure analyses, weak gravitational lensing, and galaxy evolution studies. While spectroscopic redshifts provide ground-truth precision, they are limited to millions of galaxies due to observational costs, whereas imaging surveys observe billions of sources. Traditional machine learning approaches for photo-z estimation typically frame the problem as a regression task, outputting single point estimates. This approach suffers from three primary limitations: (1) a lack of uncertainty quantification; (2) an inability to capture multi-modal posterior distributions arising from color-redshift degeneracies; and (3) systematic biases in sparsely sampled redshift regions. Furthermore, the performance of these methods is fundamentally constrained by the size, quality, and representativeness of the spectroscopic training sample.

Methodology
The authors propose a Neural Network Classification (NNC) method designed to produce well-calibrated redshift probability density functions (PDFs). The core methodology involves:

Discretization and Classification: Instead of direct regression, the continuous redshift space $[0, 2]$ is discretized into $N_{bin} = 400$ ordered bins. The neural network is trained to classify galaxies into these bins, outputting an $N_{bin}$-dimensional probability vector representing the full redshift PDF.
Loss Function Optimization: The model is trained by minimizing the Continuous Ranked Probability Score (CRPS). Unlike cross-entropy, CRPS respects the ordinal nature of redshift by operating on the cumulative distribution function (CDF). This imposes distance-sensitive penalties, distinguishing between minor deviations and catastrophic failures, and balances sharpness with calibration.
Network Architecture: A multi-layer perceptron with residual connections serves as the backbone, consisting of four fully-connected hidden layers with batch normalization, ReLU activation, and dropout.
Calibration: To ensure statistical reliability, the authors apply temperature scaling to the network logits post-training, optimizing the temperature parameter to minimize negative log-likelihood on the validation set. This transforms the raw outputs into well-calibrated PDFs, verified via Probability Integral Transform (PIT) diagnostics.
Datasets: The method is applied to two major photometric surveys:
- DESI Legacy Imaging Surveys Data Release 10 (LSDR10): Utilizing $g, r, i, z_m$ optical bands and $W1, W2$ mid-infrared bands.
- Pan-STARRS Data Release 2 (PS1DR2): Utilizing $g, r, i, z_m, y$ optical bands, augmented with unWISE $W1, W2$ mid-infrared data for a subset of sources.
Training Samples: The models are trained on an unprecedented spectroscopic sample combining SDSS DR19 and DESI DR1 (11.4 million sources), filtered for reliability. The authors specifically investigate the impact of training sample composition (SDSS-only vs. DESI-only vs. combined) and the role of infrared photometry.
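
The discretization and the distance-sensitive behaviour of CRPS can be illustrated with a small NumPy sketch. It assumes equal-width bins and a simple discrete form of CRPS (the squared difference between predicted and true CDFs, summed over bins and scaled by the bin width); the paper's exact formulation may differ, and a real training loop would use an autodiff framework rather than NumPy.

```python
import numpy as np

N_BINS = 400
Z_MIN, Z_MAX = 0.0, 2.0

def redshift_to_bin(z):
    """Map redshift(s) to a bin index in [0, N_BINS - 1], assuming equal widths."""
    frac = (np.asarray(z, dtype=float) - Z_MIN) / (Z_MAX - Z_MIN)
    return np.clip(np.floor(frac * N_BINS).astype(int), 0, N_BINS - 1)

def binned_crps(probs, true_bin):
    """Discrete CRPS averaged over a batch.

    Compares the predicted CDF with the step-function CDF of the true bin,
    so probability placed far from the truth is penalized more.
    """
    probs = np.atleast_2d(probs)
    true_bin = np.atleast_1d(true_bin)
    pred_cdf = np.cumsum(probs, axis=-1)
    k = np.arange(probs.shape[-1])
    true_cdf = (k[None, :] >= true_bin[:, None]).astype(float)
    width = (Z_MAX - Z_MIN) / probs.shape[-1]
    return np.mean(np.sum((pred_cdf - true_cdf) ** 2, axis=-1) * width)

def cross_entropy(probs, true_bin):
    """Standard classification loss, which ignores the ordering of bins."""
    probs = np.atleast_2d(probs)
    true_bin = np.atleast_1d(true_bin)
    picked = probs[np.arange(len(true_bin)), true_bin]
    return np.mean(-np.log(picked + 1e-12))

true_bin = redshift_to_bin(0.5025)          # falls in bin 100

near_miss = np.zeros(N_BINS)
near_miss[101] = 1.0                        # all mass one bin too high
far_miss = np.zeros(N_BINS)
far_miss[300] = 1.0                         # all mass 200 bins too high

crps_near, crps_far = binned_crps(near_miss, true_bin), binned_crps(far_miss, true_bin)
ce_near, ce_far = cross_entropy(near_miss, true_bin), cross_entropy(far_miss, true_bin)
print(crps_near, crps_far)   # the far miss is penalized 200 times more
print(ce_near, ce_far)       # cross-entropy cannot tell the two apart
```

In this toy case both predictions put zero probability on the true bin, so cross-entropy scores them identically, while CRPS grows with the distance between the predicted and true redshift.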

Key Results
The NNC method demonstrates superior performance compared to Random Forest (RF), XGBoost, and standard ANN regression:

LSDR10 Performance: The method achieves $\sigma_{NMAD} = 0.0153$ and an outlier fraction ($\eta$) of $0.50\%$. This represents a 45% reduction in scatter compared to RF, 31% compared to XGBoost, and 10% compared to standard ANN regression.
PS1DR2 Performance: Using optical bands alone yields $\sigma_{NMAD} = 0.0283$. However, combining PS1DR2 with unWISE mid-infrared photometry significantly improves performance to $\sigma_{NMAD} = 0.0222$ and $\eta = 0.34\%$, a $\sim22\%$ reduction in scatter.
Impact of Training Data: DESI DR1 significantly improves photo-z performance at $z > 1$. Models trained solely on SDSS data show sharp degradation at $z > 1.1$ due to a lack of high-redshift training data, whereas models trained on DESI or combined samples maintain $\sigma_{NMAD} \lesssim 0.05$ out to $z \sim 1.5$.
Survey Depth vs. Wavelength: While mid-infrared coverage is essential for breaking degeneracies, a performance gap remains between LSDR10 and PS1DR2 even with infrared data. SHAP analysis indicates that deeper photometry (LSDR10) allows the model to rely on direct spectral energy distribution (SED) features, whereas shallower photometry (PS1DR2) forces the model to rely more heavily on measurement uncertainties and aperture differences as indirect proxies.
PDF Quality: The method successfully captures multi-modal posteriors for sources with color-redshift degeneracies. Calibration diagnostics (PIT histograms) confirm that the temperature-scaled PDFs are well-calibrated and nearly uniform.
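
Temperature scaling and the PIT check can be sketched on synthetic data. Everything below is illustrative: the logits are simulated to be overconfident by a known factor of 3, the temperature is found by a simple grid search rather than a gradient-based optimizer, and the PIT uses a mid-bin convention (mass below the true bin plus half the mass inside it). None of these choices is confirmed by the paper.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def nll(logits, true_bin, temperature):
    """Mean negative log-likelihood of the true bins after scaling the logits."""
    probs = softmax(logits / temperature)
    picked = probs[np.arange(len(true_bin)), true_bin]
    return -np.mean(np.log(picked + 1e-12))

def fit_temperature(logits, true_bin, grid=None):
    """Pick the temperature on a grid that minimizes validation NLL."""
    if grid is None:
        grid = np.linspace(0.5, 5.0, 91)
    losses = [nll(logits, true_bin, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def pit_values(probs, true_bin):
    """Predicted CDF evaluated at the truth (mid-bin convention, an assumption)."""
    cdf = np.cumsum(probs, axis=-1)
    rows = np.arange(len(true_bin))
    return cdf[rows, true_bin] - 0.5 * probs[rows, true_bin]

# Synthetic validation set: Gaussian-shaped true PDFs over 20 bins.
rng = np.random.default_rng(0)
n, n_bins = 5000, 20
k = np.arange(n_bins)
centers = rng.uniform(3, 16, size=n)
true_probs = np.exp(-0.5 * ((k[None, :] - centers[:, None]) / 2.0) ** 2)
true_probs /= true_probs.sum(axis=1, keepdims=True)

# Draw one true bin per source by inverse-CDF sampling.
u = rng.random(n)
true_bin = np.minimum((np.cumsum(true_probs, axis=1) < u[:, None]).sum(axis=1), n_bins - 1)

# Simulated overconfident network: logits are 3x too sharp.
logits = 3.0 * np.log(true_probs)

t_fit = fit_temperature(logits, true_bin)
pit_raw = pit_values(softmax(logits), true_bin)
pit_cal = pit_values(softmax(logits / t_fit), true_bin)

print("fitted temperature:", t_fit)
print("PIT std before / after:", pit_raw.std(), pit_cal.std())
```

For a well-calibrated model the PIT values are close to uniform on [0, 1]. An overconfident model piles PIT values up near 0 and 1, which shows up here as a larger spread before calibration than after.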

Significance and Claims
The paper claims that the NNC method, driven by CRPS optimization and trained on the expanded DESI DR1 spectroscopic sample, provides a robust framework for generating well-calibrated redshift PDFs. The authors assert that:

Superiority: The NNC approach outperforms traditional regression and other machine learning baselines in both accuracy and uncertainty quantification.
DESI Impact: The release of DESI DR1 is a critical milestone, filling gaps in spectroscopic coverage at $z > 0.8$ and enabling high-precision photo-z estimation for intermediate-to-high redshift galaxies.
Data Requirements: High-precision photo-z estimation across the full redshift range requires both broad wavelength coverage (specifically mid-infrared to break degeneracies) and deep photometry.
Unified Catalog: The authors provide a unified photometric redshift catalog covering $>30{,}000$ deg$^2$ by combining LSDR10 and PS1DR2 using a hierarchical model selection strategy based on available photometry.
Future Applicability: The framework is survey-agnostic and applicable to next-generation surveys (e.g., CSST, Euclid, LSST), though the authors note that future applications will depend on the availability of representative spectroscopic training sets at $z > 2$, which may require future programs like DESI-II or 4MOST.

The paper concludes that the well-calibrated PDFs produced by this method are valuable for cosmological studies requiring proper uncertainty propagation, offering a significant step forward in handling the massive datasets anticipated from upcoming imaging surveys.
