Accurate spectroscopic redshift estimation using non-negative matrix factorization: application to MUSE spectra

Imagine you are an astronomer looking at a massive library of light. This isn't a library of books, but a library of galaxies. Each galaxy sends a unique "fingerprint" of light to our telescopes, called a spectrum. By reading these fingerprints, we can measure the galaxy's redshift: how much its light has been stretched on its way to us. That tells us how fast the galaxy is receding and, in turn, how far away it is and how long ago the light was emitted.

However, reading these fingerprints is incredibly hard. The light gets stretched, distorted, and sometimes mixed up with light from other galaxies. It's like trying to identify a song when it's playing in a crowded room with static noise, or when two songs are playing at the same time.

This paper introduces a new, smart way to solve this puzzle using a technique called Non-negative Matrix Factorization (NMF). Here is how it works, broken down into simple concepts:

1. The Problem: The "Cosmic Mix-Up"

For decades, astronomers have tried to match galaxy light to a library of pre-made templates (like matching a puzzle piece to a picture on the box). But galaxies are messy. Some are bright and blue (young stars), some are red and old, and some are just glowing gas clouds.

The Challenge: In deep space, we see galaxies at all different distances. A galaxy far away might look like a nearby one because its light has been stretched. It's like trying to tell if a person is wearing a red shirt or if they are just wearing a white shirt under a red sunset.
The "Redshift Desert": There is a specific range of distances where galaxies don't have any obvious "landmarks" (like bright emission lines) in their light. It's like trying to navigate a desert with no trees or rocks to mark your path.

2. The Solution: Learning the "Lego Bricks" of Light

Instead of using pre-made templates, the authors let the computer learn what galaxies look like directly from the data. They used a method called NMF.

The Analogy: The Lego Wall
Imagine you have a giant wall made of millions of different colored Lego bricks. You don't know the recipe for the wall, but you want to figure out how to rebuild it.

PCA (The old way): Imagine trying to describe the wall by saying, "It's 50% blue, 30% red, and 20% yellow." This is mathematically okay, but it's abstract. You can't point to a specific "blue" brick and say, "That's the blue part."
NMF (The new way): NMF says, "Let's find the actual Lego bricks that make up the wall." It breaks the complex wall down into a small set of fundamental, positive-only building blocks (basis vectors).
- One "brick" might represent a galaxy full of young, blue stars.
- Another "brick" might represent an old, red galaxy.
- Another "brick" might represent the specific glow of oxygen gas.

Because NMF only uses "positive" numbers (you can't have negative Lego bricks), the results are very easy to understand. It finds the actual physical parts that make up the galaxy's light.

3. How They Find the Distance (Redshift)

Once the computer has learned these "Lego bricks" (the basis vectors), it can guess the distance of a new, unknown galaxy.

The Analogy: The Tuning Fork

The computer takes a new galaxy's light spectrum.
It tries to rebuild that spectrum using its learned "Lego bricks," but it has to guess the distance first.
It tries a guess: "What if this galaxy is at distance A?" It stretches the bricks to match that distance and tries to rebuild the spectrum.
- If the guess is wrong: The bricks won't fit together well. The reconstruction will look messy and wrong.
- If the guess is right: The bricks snap into place, and the reconstruction closely matches the galaxy's light.
The computer tests thousands of distances (like tuning a radio) and picks the one where the "reconstruction error" is the lowest. That's the correct distance!

4. The Results: A Super-Helper

The team tested this method on data from MUSE, an instrument on the Very Large Telescope, which sees galaxies from the nearby universe out to very distant, early ones (redshift 0 to 6.7).

Success Rate: It got the right answer 93.7% of the time. That's a huge improvement over older methods, especially for those tricky "desert" galaxies with no landmarks.
Spotting Fakes: The telescope sometimes sees "ghosts"—faint smudges of light that aren't real galaxies (just noise). The new method can tell the difference. If the "Lego bricks" can't build a good picture of the light, the computer knows, "This is probably a fake," and flags it.
Untangling Blends: Sometimes two galaxies are so close they look like one blob of light. The method can often say, "Wait, this looks like two different galaxies mixed together," and separate them, much like unmixing two voices in a recording.

5. Why This Matters

This isn't just about one telescope. The next generation of telescopes will collect millions of spectra. Humans can't look at them all. We need a robot that is fast, smart, and understands the physics of light.

This paper shows that by teaching the computer to find the fundamental "building blocks" of galaxy light, we can automatically and accurately measure the distance to the universe's most distant objects. It's like giving astronomers a super-powered pair of glasses that can instantly read the cosmic address of any galaxy they see.

Here is a detailed technical summary of the paper "Accurate spectroscopic redshift estimation using non-negative matrix factorization: application to MUSE spectra."

1. Problem Statement

Accurate and automated redshift determination is critical for maximizing the scientific return of large spectroscopic surveys. While Multi-Object Spectroscopy (MOS) surveys (e.g., SDSS, DESI) focus on pre-selected bright objects, Integral Field Spectroscopy (IFS) instruments like MUSE (Multi Unit Spectroscopic Explorer) provide spectra for all objects in a field, reaching magnitudes as faint as 28.

This creates unique challenges:

Wide Redshift Range: MUSE covers $z = 0$ to $6.7$, leading to severe line confusion (e.g., distinguishing [O II] $\lambda\lambda3727,3729$ at $z < 1.5$ from Ly$\alpha$ $\lambda1216$ at $z > 2.8$).
The "Redshift Desert": In the range $1.5 < z < 2.8$, galaxies often lack strong emission lines, making redshift determination dependent on continuum shape.
Data Quality Issues: IFS data often contains artifacts (sky subtraction residuals, flat-fielding errors) and blended sources (multiple objects in one spectrum).
Limitations of Existing Tools: Current tools often rely on spectral template fitting (PCA, synthetic archetypes) or deep learning. Many require continuum subtraction (losing information) or need massive labeled datasets (>20,000 spectra) and struggle with the specific noise characteristics and line confusion of deep IFS fields.
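
To make the line-confusion problem concrete, the minimal sketch below computes the redshift implied by a single emission line under each identification, using $\lambda_{obs} = \lambda_{rest}(1+z)$. The rest wavelengths are the ones quoted above; the function name and the example observed wavelength of 7000 angstroms are illustrative assumptions, not values from the paper.

```python
# Rest-frame wavelengths in angstroms, as quoted in the text above.
REST_WAVELENGTHS = {"[O II]": 3727.0, "Lya": 1216.0}


def candidate_redshifts(lambda_obs):
    """Return the redshift implied by each possible line identification.

    Uses lambda_obs = lambda_rest * (1 + z). Identifications that would
    require a negative redshift are dropped.
    """
    candidates = {}
    for name, lambda_rest in REST_WAVELENGTHS.items():
        z = lambda_obs / lambda_rest - 1.0
        if z >= 0.0:
            candidates[name] = z
    return candidates


# Hypothetical single line detected at 7000 angstroms.
for name, z in candidate_redshifts(7000.0).items():
    print(f"{name}: z = {z:.3f}")
```

The same observed line is compatible with a galaxy at $z \approx 0.88$ or one at $z \approx 4.76$, which is why a lone emission line cannot fix the redshift without further information such as the continuum shape.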

2. Methodology

The authors propose a data-driven approach using Non-negative Matrix Factorization (NMF) to learn a rest-frame representation of galaxy spectra and estimate redshifts based on reconstruction error.

A. Data Preparation

Dataset: ~9,252 MUSE galaxy spectra from five Guaranteed Time Observations (GTO) surveys (HUDF, MEGAFLOW, MUSCATEL, MUSE-WIDE, MAGIC).
Selection: Galaxies with secure redshifts ($ZCONF \ge 1$) that were visually inspected to remove obvious blends.
Transformation: Spectra are transformed to a common logarithmic rest-frame wavelength grid ($2.77 \le \log_{10}\lambda \le 3.97$). Flux densities are adjusted to conserve energy during the transformation.
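
The transformation step can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes flux density per unit wavelength, uses plain linear interpolation rather than a strictly flux-conserving rebinning, and the number of grid pixels (`N_PIXELS`) and the function name are hypothetical. Only the grid limits come from the text.

```python
import numpy as np

# Common rest-frame grid limits quoted in the text.
LOG_LAMBDA_MIN, LOG_LAMBDA_MAX = 2.77, 3.97
N_PIXELS = 4801  # hypothetical: gives a step of 2.5e-4 dex
LOG_GRID = np.linspace(LOG_LAMBDA_MIN, LOG_LAMBDA_MAX, N_PIXELS)


def to_rest_frame_grid(wave_obs, flux_obs, z):
    """Shift an observed spectrum to the rest frame and resample it.

    wave_obs: observed wavelengths in angstroms (increasing).
    flux_obs: flux density per unit wavelength.
    Grid pixels outside the spectrum's coverage are returned as NaN.
    """
    wave_rest = np.asarray(wave_obs, dtype=float) / (1.0 + z)
    # Wavelength bins shrink by (1 + z), so the flux density per unit
    # wavelength grows by (1 + z) to keep the integrated flux unchanged.
    flux_rest = np.asarray(flux_obs, dtype=float) * (1.0 + z)
    return np.interp(LOG_GRID, np.log10(wave_rest), flux_rest,
                     left=np.nan, right=np.nan)


# Illustrative flat spectrum over an assumed 4750-9350 angstrom range.
wave = np.linspace(4750.0, 9350.0, 3681)
rest = to_rest_frame_grid(wave, np.ones_like(wave), z=1.0)
print(np.nanmin(rest), np.nanmax(rest), np.isnan(rest).mean())
```

Marking uncovered pixels as NaN keeps them distinguishable from measured zeros, which matters for a factorization that must treat missing values explicitly.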

B. Non-negative Matrix Factorization (NMF)

The core of the method is decomposing the data matrix $X$ (spectra) into two non-negative matrices $W$ and $H$ :
$X \simeq WH$

Algorithm: The authors use "nearly-NMF" (Green & Bailey 2024), an extension of standard NMF that handles heteroscedastic uncertainties, missing values, and negative flux values (common in sky-subtracted data) without zero-clipping.
Basis Vectors ( $H$ ): These represent the fundamental "parts" of galaxy spectra (e.g., specific emission lines, continuum shapes).
Rank Selection: A 5-fold cross-validation was performed to determine the optimal number of basis vectors ( $k$ ). Rank $k=10$ was chosen as it provided the best balance between the Good Fraction (GF) and Mean Absolute Error (MAE).
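
As a rough illustration of the factorization $X \simeq WH$, the sketch below uses the classical weighted multiplicative-update scheme, with inverse-variance weights $V$ and zero weight for missing pixels. It is a stand-in, not the "nearly-NMF" algorithm: it assumes a non-negative $X$ and does not reproduce nearly-NMF's handling of negative flux values. Function names and the synthetic data are hypothetical.

```python
import numpy as np


def weighted_nmf(X, V, k, n_iter=200, seed=0, eps=1e-12):
    """Factorize X ~= W @ H with non-negative W and H.

    X: (n_spectra, n_pixels) non-negative data matrix.
    V: same shape, inverse-variance weights (0 marks a missing pixel).
    Returns W (n_spectra, k) coefficients and H (k, n_pixels) basis vectors.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + 0.1
    H = rng.random((k, X.shape[1])) + 0.1
    VX = V * X
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        W *= (VX @ H.T) / ((V * (W @ H)) @ H.T + eps)
        H *= (W.T @ VX) / (W.T @ (V * (W @ H)) + eps)
    return W, H


def weighted_loss(X, V, W, H):
    """Weighted squared reconstruction error (a chi-square if V = 1/sigma^2)."""
    return float(np.sum(V * (X - W @ H) ** 2))


# Synthetic rank-3 data with about 10% of the pixels flagged as missing.
rng = np.random.default_rng(1)
X = rng.random((40, 3)) @ rng.random((3, 60))
V = np.ones_like(X)
V[rng.random(X.shape) < 0.1] = 0.0
W, H = weighted_nmf(X, V, k=3)
print(f"weighted loss after fitting: {weighted_loss(X, V, W, H):.4f}")
```

In a rank-selection experiment like the one described above, the same fit would be repeated for several values of `k` on each cross-validation fold.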

C. Redshift Prediction Workflow

Trial Redshifts: For a new observed spectrum, the method tests redshifts from $z=0$ to $6.7$ (step size 0.0005).
De-redshifting: The spectrum is shifted to the rest frame assuming a trial redshift $z_{test}$ .
Reconstruction: The de-redshifted spectrum is projected onto the learned NMF basis vectors ( $H$ ) using Non-Negative Least Squares (NNLS) to find coefficients $W$ .
Error Minimization: The reconstruction error ( $\chi^2$ ) is calculated. The redshift $z_p$ that minimizes $\chi^2$ is selected as the predicted redshift.
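
The four steps above can be sketched on toy data as follows. Several choices here are assumptions made for illustration: the basis is a hand-built two-vector toy rather than a learned NMF basis, the non-negative least squares solver is a small coordinate-descent stand-in, the trial grid is ten times coarser than the 0.0005 step in the text, and instead of resampling the spectrum the sketch evaluates the basis at the de-redshifted pixel wavelengths, which is equivalent up to interpolation.

```python
import numpy as np


def nnls_cd(A, b, n_sweeps=50):
    """Non-negative least squares by cyclic coordinate descent (stand-in solver)."""
    G, c = A.T @ A, A.T @ b
    x = np.zeros(A.shape[1])
    for _ in range(n_sweeps):
        for j in range(x.size):
            if G[j, j] > 0.0:
                x[j] = max(0.0, x[j] - (G[j] @ x - c[j]) / G[j, j])
    return x


def scan_redshift(wave_obs, flux, sigma, log_grid, H, z_grid):
    """Return chi2(z): weighted reconstruction error at each trial redshift."""
    chi2 = np.empty(z_grid.size)
    b = flux / sigma
    for i, z in enumerate(z_grid):
        log_rest = np.log10(wave_obs / (1.0 + z))
        # Basis vectors evaluated at the rest wavelength of each observed pixel.
        A = np.stack([np.interp(log_rest, log_grid, h) for h in H], axis=1)
        A = A / sigma[:, None]
        coeffs = nnls_cd(A, b)
        chi2[i] = np.sum((b - A @ coeffs) ** 2)
    return chi2


# Toy rest-frame basis (hypothetical): a flat continuum and a set of emission lines.
log_grid = np.linspace(2.77, 3.97, 4801)
lines = sum(np.exp(-0.5 * ((log_grid - np.log10(w)) / 1e-3) ** 2)
            for w in (3727.0, 4861.0, 5007.0))
H = np.stack([np.ones_like(log_grid), lines])

# Simulated observation at z = 0.5 with Gaussian noise.
rng = np.random.default_rng(0)
wave_obs = np.arange(4750.0, 9300.0, 2.5)
sigma = np.full(wave_obs.size, 0.1)
truth = np.stack([np.interp(np.log10(wave_obs / 1.5), log_grid, h) for h in H], axis=1)
flux = truth @ np.array([1.0, 3.0]) + rng.normal(0.0, 0.1, wave_obs.size)

z_grid = np.arange(0.0, 1.5, 0.005)  # coarser than the 0.0005 step in the text
chi2 = scan_redshift(wave_obs, flux, sigma, log_grid, H, z_grid)
z_best = z_grid[np.argmin(chi2)]
print(f"best-fit redshift: {z_best:.3f}")
```

The overall $(1+z)$ flux scaling is omitted because, at a fixed trial redshift, it is a constant factor absorbed by the fitted coefficients.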

D. Evaluation Metrics

To assess prediction quality, two scores are derived from the $\chi^2(z)$ curve:

Significance Score ( $\Delta\chi^2$ ): Measures how deep the minimum is relative to the baseline (first quartile of $\chi^2$ ).
Robustness Score ( $R$ ): Measures the separation between the first and second minima of the $\chi^2$ curve, normalized by the intrinsic dispersion.
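
The summary does not give the exact formulas for these two scores, so the sketch below is only one possible reading. It assumes that the significance is the fractional depth of the global minimum below the first quartile, and that the robustness is the gap between the global minimum and the deepest competing local minimum, divided by a robust scatter estimate (scaled median absolute deviation) of the curve. Both functional forms, and the synthetic $\chi^2(z)$ curve, are assumptions.

```python
import numpy as np


def significance_score(chi2):
    """Fractional depth of the global minimum below the first quartile (assumed form)."""
    baseline = np.percentile(chi2, 25)
    return float((baseline - chi2.min()) / baseline)


def robustness_score(chi2):
    """Gap to the deepest competing local minimum over a robust scatter (assumed form)."""
    best = int(np.argmin(chi2))
    i = np.arange(1, chi2.size - 1)
    local = i[(chi2[i] < chi2[i - 1]) & (chi2[i] <= chi2[i + 1])]
    rivals = local[local != best]
    scatter = 1.4826 * np.median(np.abs(chi2 - np.median(chi2)))
    if rivals.size == 0 or scatter == 0.0:
        return float("inf")  # no competing minimum, or a perfectly flat curve
    return float((chi2[rivals].min() - chi2[best]) / scatter)


# Synthetic curve: a ripple, a deep minimum at z = 1 and a shallower one at z = 3.
z = np.linspace(0.0, 6.7, 1341)
chi2 = (100.0 + 2.0 * np.sin(20.0 * z)
        - 60.0 * np.exp(-0.5 * ((z - 1.0) / 0.02) ** 2)
        - 20.0 * np.exp(-0.5 * ((z - 3.0) / 0.02) ** 2))
sig = significance_score(chi2)
rob = robustness_score(chi2)
print(f"significance = {sig:.2f}, robustness = {rob:.1f}")
```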

3. Key Contributions

NMF-Based Framework: Introduced a novel application of NMF for redshift estimation that learns templates directly from data rather than relying on synthetic models or PCA (which allows negative values and is less interpretable).
Handling IFS Specifics: The method explicitly handles negative flux values and missing data via the "nearly-NMF" algorithm, making it robust for MUSE data.
Full Spectrum Utilization: Unlike many deep learning or template methods that subtract the continuum, this method uses the full spectrum (continuum + lines), which is crucial for distinguishing between Ly $\alpha$ and [O II] in the redshift desert.
Blended Source Detection: Developed a secondary application to detect blended sources by re-scanning the spectrum after fitting the dominant component and augmenting the NMF basis with the fitted spectrum.
Open Source Implementation: The authors released the code in a Julia package (Moose.jl) with a Python wrapper.

4. Results

The method was tested on an independent set of 1,454 MUSE spectra ( $ZCONF \ge 2$ ).

Overall Accuracy: Achieved a Good Fraction (GF) of 93.7% (where error $\Delta z < 0.005(1+z)$ ).
Performance by Redshift:
- Performance remains $>90\%$ across most bins.
- Dips occur in the "redshift desert" ($z \sim 2$ and $2.8$) due to lack of spectral features, and at very high redshifts ($z \ge 6$) due to small sample sizes.
Dependence on SNR:
- Performance is highly dependent on line SNR ( $SNR_{lines}$ ). GF approaches unity when $SNR_{lines} \sim 13$ .
- Performance is relatively stable with continuum SNR.
False Source Separation: Using a threshold of $\log_{10}\Delta\chi^2 = -2$ , the method separates true sources ( $ZCONF \ge 1$ ) from false detections ( $ZCONF = 0$ ) with 95.9% completeness and 96.0% purity.
Blended Source Detection: Applied to MXDF sources, the method achieved an AUC of 0.87. At an optimal threshold, it recovered 78% of blended sources with an 18% false positive rate.
New Data Validation: When applied to the improved MUSE-WIDE DR2 reduction, the GF rose to 97.1% for secure redshifts, demonstrating sensitivity to data quality improvements.
Computational Efficiency: Processing time is ~200 ms per spectrum (testing 7,000 redshifts) on a 24-thread CPU.
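
The Good Fraction criterion and the completeness and purity figures quoted above can be computed as in the following sketch. The function names and the small example arrays are hypothetical, and the selection direction (a source counts as real when its score is at or above the threshold) is an assumption consistent with the description of the significance score.

```python
import numpy as np


def good_fraction(z_pred, z_true, tol=0.005):
    """Fraction of predictions with |z_pred - z_true| < tol * (1 + z_true)."""
    z_pred = np.asarray(z_pred, dtype=float)
    z_true = np.asarray(z_true, dtype=float)
    return float(np.mean(np.abs(z_pred - z_true) < tol * (1.0 + z_true)))


def completeness_purity(score, is_true_source, threshold):
    """Completeness and purity of the selection score >= threshold."""
    score = np.asarray(score, dtype=float)
    is_true_source = np.asarray(is_true_source, dtype=bool)
    selected = score >= threshold
    n_hit = np.sum(selected & is_true_source)
    completeness = n_hit / np.sum(is_true_source)  # true sources recovered
    purity = n_hit / np.sum(selected)              # selected sources that are true
    return float(completeness), float(purity)


# Made-up example: the third prediction mistakes Ly-alpha at z = 3 for [O II].
z_true = [0.5, 1.2, 3.0, 4.5]
z_pred = [0.501, 1.2, 0.305, 4.49]
gf = good_fraction(z_pred, z_true)

# Made-up log10 significance scores for three true and two false detections.
scores = [-0.5, -1.0, -2.5, -1.5, -3.0]
truth = [True, True, True, False, False]
comp, pur = completeness_purity(scores, truth, threshold=-2.0)
print(gf, comp, pur)
```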

5. Significance and Conclusion

This paper demonstrates that NMF provides a powerful, physically motivated, and data-driven framework for automated redshift estimation in large spectroscopic surveys, particularly for IFS data like MUSE.

Robustness: The method outperforms or complements existing tools in the challenging $0 < z < 6.7$ range, specifically addressing the line confusion and continuum-dependent redshifts that plague classical template fitting.
Interpretability: The NMF basis vectors naturally capture kinematic shifts and radiative transfer effects (e.g., blue/red shifts in Ly $\alpha$ and [O II]), offering physical insights beyond simple classification.
Scalability: With high accuracy, intrinsic reliability diagnostics ( $\Delta\chi^2$ , $R$ ), and the ability to detect blends and false positives, this method is well-suited for the upcoming generation of massive spectroscopic surveys (e.g., DESI, 4MOST, MOONS, and future ELT instruments).

The authors conclude that while limitations exist regarding severe data artifacts (sky residuals), the approach offers a flexible alternative to deep learning that requires fewer training examples and preserves full spectral information.