Metriq: A Collaborative Platform for Benchmarking Quantum Computers

The paper introduces Metriq, an open-source collaborative platform that unifies benchmark definition, execution, and data collection to enable reproducible, cross-platform performance assessment of quantum computers through a diverse suite of metrics and a composite Metriq Score.

Alessandro Cosentino, Changhao Li, Vincent Russo, Bradley A. Chase, Tom Lubinski, Siyuan Niu, Neer Patel, Nathan Shammah, William J. Zeng

Published Tue, 10 Ma

Imagine the world of quantum computing as a bustling, chaotic city where every builder (IBM, Quantinuum, Rigetti, and others) is constructing its own unique skyscraper. Some are made of glass, some of steel, some of wood. Each builder claims to have the "tallest" or "fastest" tower, but they all measure height with different rulers, use different units of time, and refuse to let anyone else inspect their blueprints.

This is the problem the paper "Metriq" addresses. It introduces a new, open-source platform designed to be the universal "Consumer Reports" for quantum computers.

Here is a breakdown of how Metriq works, using simple analogies:

1. The Problem: A Tower of Babel

Right now, if you want to know which quantum computer is best, it's like trying to compare cars where one manufacturer measures speed in "miles per hour," another in "how many clouds they can fly through," and a third just says, "Trust us, we're fast."

  • The Issue: There is no standard way to compare them. The data is scattered, inconsistent, and often hidden behind paywalls or specific company tools.
  • The Result: It's impossible to get a clear, honest picture of who is actually winning the race.

2. The Solution: Metriq (The Universal Translator)

Metriq is a collaborative platform built by an independent group (the Unitary Foundation) that acts as a neutral referee. It doesn't belong to any one company. It has three main parts that work together like a well-oiled machine:

  • The Runner (metriq-gym): Think of this as a universal remote control. Instead of needing a different remote for every TV brand, Metriq has one remote that can talk to any quantum computer, whether it's made by IBM, Google, or a university lab. It sends the same set of tests to every machine.
  • The Dataset (metriq-data): This is the public library of results. Every time a test is run, the results are saved in a standardized format. No more hiding data in private spreadsheets. Everyone can see the raw numbers.
  • The Website (metriq-web): This is the interactive dashboard. It takes all that raw data and turns it into easy-to-read charts, graphs, and leaderboards. You can filter by "fastest," "most accurate," or "best for machine learning."
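To make the "public library" idea concrete, here is a minimal sketch of what a standardized result record and a leaderboard filter could look like. The field names, the `BenchmarkResult` class, and the helper functions are all hypothetical and chosen for illustration; they are not the actual metriq-data schema or the metriq-web code.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark run in a shared, uniform shape (hypothetical schema)."""
    benchmark: str    # e.g. "EPLG" or "CLOPS"
    provider: str     # who operates the device
    device: str       # device name
    num_qubits: int   # how many qubits the run used
    value: float      # the measured metric
    timestamp: str    # when the run happened (ISO 8601 string)


def to_record(result: BenchmarkResult) -> str:
    """Serialize a result to JSON so every run is stored the same way."""
    return json.dumps(asdict(result), sort_keys=True)


def leaderboard(results, benchmark, higher_is_better=True):
    """Keep the runs for one benchmark and sort them best-first."""
    rows = [r for r in results if r.benchmark == benchmark]
    return sorted(rows, key=lambda r: r.value, reverse=higher_is_better)


# Made-up numbers and device names, purely for illustration.
runs = [
    BenchmarkResult("EPLG", "provider_x", "device_a", 100, 0.012, "t0"),
    BenchmarkResult("EPLG", "provider_y", "device_b", 50, 0.008, "t1"),
    BenchmarkResult("CLOPS", "provider_x", "device_a", 100, 150000.0, "t2"),
]

# EPLG is an error rate, so lower is better.
for row in leaderboard(runs, "EPLG", higher_is_better=False):
    print(to_record(row))
```

The point of the sketch is the division of labor: the runner produces records, the dataset stores them in one shape, and the website only has to filter and sort.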

3. The Tests: The "Driver's License" Exam

Metriq doesn't just run one test; it runs a whole battery of exams to see how the "cars" handle different terrains. The tests come in two types:

A. The "Engine Room" Tests (System-Level)
These check the basic health of the machine, like checking oil pressure or tire tread.

  • BSEQ (Bell State Effective Qubits): Imagine trying to hold hands with a partner across a crowded room. Can you stay connected without letting go? This test checks if the computer can keep qubits "entangled" (connected) across the whole chip.
  • EPLG (Error Per Layered Gate): This is like counting how many steps you trip over while walking a tightrope. It measures how many errors happen when the computer performs a sequence of basic moves.
  • CLOPS (Circuit Layer Operations Per Second): This is the speedometer. It measures how fast the computer can actually execute instructions, including the time it takes to load the program and wait in line.
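Two of these tests boil down to small calculations once the hardware data is in hand. The sketch below assumes the commonly published definition of EPLG from layer fidelity, EPLG = 1 - LF^(1/n), where LF is the measured layer fidelity and n is the number of two-qubit gates in the layer. For BSEQ it assumes the score is the size of the largest connected group of qubits whose pairwise Bell tests beat the classical CHSH bound of 2. The function names and input shapes are hypothetical, and the real metriq-gym implementation may differ.

```python
from collections import defaultdict


def eplg(layer_fidelity: float, num_two_qubit_gates: int) -> float:
    """Error per layered gate, assuming EPLG = 1 - LF ** (1 / n)."""
    if not 0.0 < layer_fidelity <= 1.0:
        raise ValueError("layer_fidelity must be in (0, 1]")
    if num_two_qubit_gates < 1:
        raise ValueError("need at least one two-qubit gate")
    return 1.0 - layer_fidelity ** (1.0 / num_two_qubit_gates)


def largest_entangled_component(chsh_by_pair, classical_bound=2.0) -> int:
    """Size of the largest connected set of qubits linked by passing pairs.

    chsh_by_pair maps a qubit pair (a, b) to its measured CHSH value.
    A pair "passes" when its value exceeds the classical bound.
    """
    adjacency = defaultdict(set)
    for (a, b), chsh_value in chsh_by_pair.items():
        if chsh_value > classical_bound:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first walk over one connected component.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


# Made-up inputs: a layer fidelity of 0.81 over 2 gates gives an EPLG of about 0.1.
print(eplg(0.81, 2))

# Made-up CHSH values on a 5-qubit line; the pair (2, 3) fails the test.
print(largest_entangled_component({(0, 1): 2.5, (1, 2): 2.3, (2, 3): 1.9, (3, 4): 2.6}))
```

In the hand-holding analogy, the failing pair (2, 3) is where the chain breaks, so the largest group still holding hands is qubits 0, 1, and 2.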

B. The "Road Trip" Tests (Application-Level)
These tests see if the computer can actually do something useful, not just run in circles.

  • QML Kernel: Can the computer recognize patterns in data? (Like a very basic version of AI).
  • WIT (Wormhole-Inspired Teleportation): A fancy physics test. It runs a teleportation protocol inspired by a "wormhole" (a shortcut through space) to see if the computer can preserve information while moving it around.
  • LR-QAOA (Linear Ramp QAOA): A puzzle-solving test. Can the computer find the best solution to a complex optimization problem (like finding the shortest route for a delivery truck)?

4. The Score: The "Metriq Score"

How do you combine all these different tests into one number?
Imagine you are rating a restaurant. You have scores for:

  • Food quality (0-10)
  • Service speed (0-10)
  • Ambiance (0-10)

Just as a reviewer might fold those three ratings into one overall star rating, Metriq combines the results of all its different tests into a single Metriq Score.

  • The Twist: The tests are weighted by difficulty. Passing a test with 100 qubits is worth much more than passing one with 5 qubits, just like driving a Formula 1 car is harder than driving a go-kart.
  • The Baseline: One computer (currently an IBM machine called "Torino") serves as the "standard" and is assigned a score of 100. If another machine scores 120, it's 20% better than the standard. If it scores 50, it's half as good.
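The weighting and baseline ideas can be sketched in a few lines. This summary does not spell out the paper's exact aggregation formula, so the code below assumes the simplest version: divide each metric by the baseline machine's value, take a weighted average of those ratios, and scale so the baseline lands on 100. The function name, the weights, and the metric values are all hypothetical, and every metric is assumed to be "higher is better."

```python
def composite_score(device: dict, baseline: dict, weights: dict) -> float:
    """Weighted average of device/baseline ratios, scaled to baseline = 100.

    Assumption: a plain weighted arithmetic mean, which may not match
    the aggregation the Metriq Score actually uses.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_ratios = sum(
        weights[name] * device[name] / baseline[name] for name in weights
    )
    return 100.0 * weighted_ratios / total_weight


# Made-up metrics; harder tests get larger weights (hypothetical values).
weights = {"test_5_qubits": 1.0, "test_100_qubits": 4.0}
baseline = {"test_5_qubits": 0.90, "test_100_qubits": 0.40}
challenger = {"test_5_qubits": 0.90, "test_100_qubits": 0.50}

print(composite_score(baseline, baseline, weights))    # the baseline scores 100
print(composite_score(challenger, baseline, weights))  # above 100: better on the hard test
```

Because the 100-qubit test carries four times the weight, the challenger's gain there moves the total much more than the same gain on the 5-qubit test would.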

5. Why This Matters

  • Transparency: No more "black box" results. If a company claims their computer is the best, Metriq lets you see the raw data to verify it.
  • Community Power: Anyone can suggest a new test or run a test on their own lab equipment. It's a living, breathing project that grows with the technology.
  • Future-Proofing: As quantum computers get bigger and better, Metriq updates its tests. It's designed to track progress over years, not just take a snapshot today.

The Bottom Line

With Metriq, the quantum world gets a fair, open, and standardized way to say, "Okay, let's see who is actually the best at what they claim to do."

It turns a chaotic race with different rules into a clear, transparent competition where the best technology can finally shine. And the best part? It's free for everyone to use, check, and improve.