Estimating the completeness of the QUBRICS Survey with 3501 QSO redshifts from Gaia DR3 spectra

This paper evaluates the completeness of the QUBRICS survey by analyzing 3,501 QSO redshifts from Gaia DR3 spectra, confirming the high efficiency of its selection methods (89% recall) and providing a spectroscopic completeness estimate of 82% alongside reliable redshifts for 1,223 new quasars to support future cosmological studies.

Matteo Porru, Stefano Cristiani, Francesco Guarneri, Giorgio Calderone, Andrea Grazian, Konstantina Boutsia, Andrea Trost, Valentina D'Odorico, Guido Cupani, Catarina M. J. Marques, Francesco Chiti Tegli, Fabio Fontanot


Here is an explanation of the paper, translated into everyday language with some creative analogies.

The Big Picture: Finding the Universe's "Lighthouses"

Imagine the universe is a vast, dark ocean. To navigate it and understand how it was formed, astronomers need "lighthouses." In astronomy, these lighthouses are Quasi-Stellar Objects (QSOs), also known as quasars: the incredibly bright cores of distant galaxies, powered by supermassive black holes, shining from billions of light-years away.

For a long time, astronomers had a problem: they had built a very good map of lighthouses in the Northern Hemisphere (the sky above the equator), but the Southern Hemisphere was a dark, uncharted territory. They knew they were missing a lot of lighthouses down south.

Enter the QUBRICS Survey. Think of QUBRICS as a new, high-tech lighthouse-hunting team launched in 2019 specifically to scan the southern sky. They used powerful computers and machine learning to predict where these bright objects might be hiding, then pointed telescopes at those spots to confirm them.

The Problem: "Did We Miss Any?"

The team had found over 1,300 new quasars, which was great. But for scientists to use this data to calculate the history of the universe (like how fast it's expanding), they needed to know one crucial thing: How complete is our list?

Did they find all the lighthouses they could see? Or did their computer algorithms miss some? If they missed 30% of the lighthouses, their calculations about the universe would be wrong.

To answer this, they needed a "gold standard" test. They couldn't just ask their own computer, "Did you do a good job?" because the computer might be biased. They needed an independent referee.

The Solution: The "Gaia Spectroscope" Referee

The authors of this paper acted as the referees. They used data from Gaia, a European space observatory that has been taking low-resolution "snapshots" (spectra) of millions of stars and galaxies.

Think of it this way:

  • The QUBRICS Team: Used a high-powered, specialized metal detector (Machine Learning) to scan the ground for gold (quasars).
  • The Gaia Team: Walked the same ground with a simple, reliable metal detector that everyone trusts, but didn't know about the QUBRICS team's specific search rules.

The researchers took 3,501 objects that Gaia's spectra confirmed were definitely quasars. They then asked: "How many of these did the QUBRICS team's computer algorithms actually flag as candidates?"
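
To make the referee's bookkeeping concrete, here is a minimal sketch of what that check boils down to, using hypothetical file and column names rather than the authors' actual pipeline: cross-match the Gaia-confirmed quasars against a QUBRICS candidate list and compute the fraction that was recovered (the "recall").

```python
import pandas as pd

# Hypothetical file and column names -- the real Gaia/QUBRICS catalogues differ.
# gaia_qsos.csv    : sources whose Gaia DR3 spectra confirm them as quasars
# qubrics_cands.csv: sources flagged as candidates by one QUBRICS algorithm
gaia_qsos = pd.read_csv("gaia_qsos.csv")       # e.g. columns: source_id, z
candidates = pd.read_csv("qubrics_cands.csv")  # e.g. column:  source_id

# A confirmed quasar counts as "recovered" if it appears in the candidate list.
candidate_ids = set(candidates["source_id"])
recovered = gaia_qsos["source_id"].isin(candidate_ids)

# Recall = recovered quasars / all confirmed quasars.
recall = recovered.mean()
print(f"Recovered {int(recovered.sum())} of {len(gaia_qsos)} quasars "
      f"(recall = {recall:.0%})")
```

In practice the matching is more careful than a toy CSV lookup (sky positions, magnitudes, and so on), but the headline percentages in the next section are exactly this kind of "recovered / total" fraction.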

The Results: The Scorecard

They tested two different "metal detectors" (algorithms) used by QUBRICS:

1. The XGB Algorithm (The "Smart Predictor")

  • The Test: They looked at 152 quasars that Gaia found but QUBRICS hadn't yet classified.
  • The Result: The XGB algorithm correctly identified 89% of them as candidates.
  • The Analogy: Imagine a security guard at a club. If 100 VIPs try to get in, this guard lets 89 of them through. He's very good, but he still misses a few people who look a bit like regular guests.

2. The PRF Algorithm (The "Probabilistic Classifier")

  • The Test: They looked at 69 similar quasars.
  • The Result: The PRF algorithm correctly identified 66% of them.
  • The Analogy: This guard is a bit more cautious. Out of 100 VIPs, he only lets 66 through, turning away 34 who actually belong there. He's less efficient at finding the "hidden" ones.

The "Missed" Ones: Why did they get lost?

The paper found that the algorithms mostly missed quasars that were right on the edge of the "high redshift" zone (around a specific distance threshold).

  • The Analogy: Imagine the algorithms are trained to spot "tall people." If someone is 5'11" and the cutoff is 6'0", the computer might get confused and think they are short. Most of the missed quasars were just slightly below the "tall" threshold, making them look like ordinary stars to the computer.
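
If you prefer to see the boundary effect without the analogy, here is a minimal sketch (the cutoff value and the labels are made up for illustration, not taken from the paper) of how a hard redshift cut in the training labels turns a genuine quasar sitting just below the cut into a "negative" example:

```python
# Toy illustration only -- Z_CUT is a made-up number, not the survey's
# actual selection limit, and the labels are invented for the example.
Z_CUT = 2.5

def training_label(redshift: float) -> str:
    """Hard cut used to label training examples for the classifier."""
    return "high-z QSO" if redshift >= Z_CUT else "other"

# A genuine quasar just below the cut gets the "other" label, so the
# classifier learns to reject sources that look almost exactly like it.
for z in (2.40, 2.49, 2.51, 3.10):
    print(f"z = {z:.2f} -> {training_label(z)}")
```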

The Bonus: A Treasure Trove of New Discoveries

While checking their work, the researchers didn't just find errors; they found 1,223 brand new quasars that no one had ever officially cataloged before!

  • The Analogy: While checking the security logs, the referees realized, "Hey, we actually found over 1,200 VIPs that the club didn't even know were in the building!"
  • These new discoveries are now added to the database, making the map of the southern sky even brighter.

The Bottom Line

This paper is a "quality control" report. It confirms that the QUBRICS survey is doing an excellent job (finding about 89% of the hidden lighthouses with its best tool).

  • Why it matters: Now that we know how complete the survey is (its best selection tool recovers about 89% of known quasars, and the overall spectroscopic completeness comes out at roughly 82%), cosmologists can trust the data to study the universe's expansion and the history of galaxies.
  • Future plans: The team knows where their "metal detectors" are weak (near the distance threshold). They plan to feed the new data they found into the computers to train them better, so next time they might find 95% or even 99% of the lighthouses.

In short: They built a great map of the southern sky, checked it against a trusted referee, found that its best tool catches roughly nine out of ten of the known lighthouses, and discovered a bunch of new islands along the way.