Benchmarking the Lights Out Problem on Real Quantum Hardware

This paper benchmarks Grover's search for the Lights Out problem on real IBM and IQM quantum hardware, revealing significant generational improvements in IBM devices, calibration-dependent performance variability, and the reliability of the IQM Garnet device despite output distributions close to uniform on other IQM systems.

Original authors: Maksims Dimitrijevs, Maria Palchiha, Abuzer Yakaryilmaz

Published 2026-02-19

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you have a giant, magical light switch board. This is the "Lights Out" puzzle: you have a grid of lights, some on and some off. When you click one switch, it flips that light and its neighbors. Your goal? Turn all the lights off.
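To make the rule concrete, here is a minimal sketch of the puzzle on a 2x2 board in plain Python. It is an illustration only, not the authors' code: the cell numbering, the assumption that "neighbors" means the lights directly above, below, left, and right, and the helper names (`toggled_by`, `apply_presses`, `solve`) are all made up for this example. It simply tries every combination of presses and keeps the ones that turn everything off.

```python
from itertools import product

SIZE = 2  # 2x2 board, cells numbered 0..3 in row-major order (an assumption)


def toggled_by(cell):
    """Cells flipped when `cell` is pressed: itself plus its orthogonal neighbors."""
    row, col = divmod(cell, SIZE)
    cells = [cell]
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < SIZE and 0 <= c < SIZE:
            cells.append(r * SIZE + c)
    return cells


def apply_presses(lights, presses):
    """Return the board after pressing every switch whose entry in `presses` is 1."""
    lights = list(lights)
    for cell, pressed in enumerate(presses):
        if pressed:
            for target in toggled_by(cell):
                lights[target] ^= 1  # flip: 0 -> 1, 1 -> 0
    return tuple(lights)


def solve(lights):
    """Brute force: try all 2**4 press patterns, keep those that leave every light off."""
    n = SIZE * SIZE
    all_off = (0,) * n
    return [p for p in product((0, 1), repeat=n) if apply_presses(lights, p) == all_off]


print(solve((1, 1, 1, 1)))  # start with all four lights on
```

With all four lights on, the only pattern that works is pressing every switch once, because each light then gets flipped three times (by itself and by its two neighbors). Sixteen combinations is trivial for a laptop; the point of the benchmark is to see whether a quantum machine can find the same answer.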

Now, imagine trying to solve this puzzle not with your brain, but with a quantum computer—a super-advanced machine that uses the weird rules of physics to calculate things.

This paper is like a report card for two different brands of these quantum computers (IBM and IQM). The authors tried to solve the "Lights Out" puzzle on these machines to see how good they really are at doing real work.

Here is the story of their experiment, broken down simply:

1. The Test: Two Different Puzzles

The researchers built two versions of the puzzle to test the machines:

The Small Puzzle (2x2 Grid): A tiny board with 4 lights. This is like a warm-up exercise.
The Tricky Puzzle (Möbius Ladder): A board where the lights are arranged in a circle, but with a twist (like a Möbius strip). This is a bit harder and requires more "thinking power" (qubits).
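For readers who want to see what "a circle with a twist" means, a Möbius ladder can be described as a ring of lights where each light is also wired to the one directly opposite it, so every light has exactly three neighbors. The sketch below builds that wiring in Python. The board size of 8 is chosen purely for illustration (this summary does not state the size the paper used), and the function names are hypothetical.

```python
def mobius_ladder(num_lights):
    """Neighbor sets for a Möbius ladder: a ring plus a link to the opposite light."""
    if num_lights < 6 or num_lights % 2:
        raise ValueError("need an even number of lights, at least 6")
    half = num_lights // 2
    neighbors = {v: set() for v in range(num_lights)}
    for v in range(num_lights):
        # Next light around the ring, and the light directly across the ring.
        for u in ((v + 1) % num_lights, (v + half) % num_lights):
            neighbors[v].add(u)
            neighbors[u].add(v)
    return neighbors


def press(lights, switch, neighbors):
    """Flip the pressed light and each of its three neighbors."""
    lights = list(lights)
    for target in {switch} | neighbors[switch]:
        lights[target] ^= 1
    return lights


board = mobius_ladder(8)  # 8 lights is an illustrative size, not the paper's
print(sorted(board[0]))
print(press([0] * 8, 0, board))
```

Pressing switch 0 on an all-off board of this size lights up positions 0, 1, 4, and 7: the light itself, its two ring neighbors, and the one across the ring.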

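They used Grover's Search, which is like a super-fast detective algorithm. A normal computer may have to check the possible switch combinations one by one. Grover's algorithm instead works on all the combinations in superposition and repeatedly boosts the odds of the correct one, so it needs only about the square root of that many checks. The answer is not instant or guaranteed: on a perfect machine the right combination comes out with high probability, and on a noisy one that probability drops.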
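The idea behind Grover's search can be sketched with a small, noise-free simulation in plain Python. This is a textbook-style illustration under stated assumptions, not the circuit the authors ran: the "oracle" here just flips the sign of an answer we already know (index 15 is arbitrary), whereas a real oracle would have to encode the Lights Out rules, and 16 candidates is chosen to match the number of press patterns on a 4-light board.

```python
import math


def optimal_iterations(num_items):
    """Usual rule of thumb for one marked item: about (pi / 4) * sqrt(N) rounds."""
    return math.floor(math.pi / 4 * math.sqrt(num_items))


def grover_probabilities(num_items, marked, iterations):
    """Ideal Grover search with a single marked item, tracked as real amplitudes."""
    # Start in an equal superposition over every candidate.
    amplitudes = [1 / math.sqrt(num_items)] * num_items
    for _ in range(iterations):
        # Oracle: flip the sign of the marked candidate's amplitude.
        amplitudes[marked] = -amplitudes[marked]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amplitudes) / num_items
        amplitudes = [2 * mean - a for a in amplitudes]
    return [a * a for a in amplitudes]


rounds = optimal_iterations(16)
probabilities = grover_probabilities(16, marked=15, iterations=rounds)
print(rounds, round(probabilities[15], 3))
```

In this ideal setting three rounds are enough to push the chance of reading out the right answer above 95%, compared with 1 in 16 for a blind guess. Real hardware adds noise at every step, which is exactly what the benchmark is probing.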

2. The Contenders: IBM vs. IQM

They tested these puzzles on real, publicly available quantum computers from two companies:

IBM: They used three different machines. Two were "newer models" (Heron r2) and one was an "older model" (Heron r1).
IQM: They used three different machines whose qubits are wired up in different layouts (like a square grid vs. a star shape).

3. The Results: Who Won?

The IBM Story: "Newer isn't always better"

The Upgrade: The newer IBM machines (Heron r2) generally did a better job than the older one. It's like upgrading from a 2023 car to a 2024 model; the engine is smoother.
The Surprise: However, the authors found that not all new cars are fast. One of the "new" IBM machines actually performed worse than the "old" one sometimes!
The Calibration Factor: The most important discovery was about calibration. Think of calibration like tuning a guitar. Even if you have a brand-new, expensive guitar, if it's out of tune, it sounds terrible.
- On some days, a machine was perfectly tuned and solved the puzzle well.
- On other days (or after maintenance), the same machine was "out of tune" and produced random noise.
- Lesson: You can't just pick the "newest" machine; you have to check if it's "tuned" that day.

The IQM Story: "Great design, but noisy"

The Design: IQM machines are very good at organizing the puzzle pieces. When the researchers translated their puzzle for IQM, the instructions became shorter and more efficient than for IBM. It's like IQM found a shortcut through the maze.
The Problem: Despite having a better map, the IQM machines were too "noisy." The results came out looking like a random shuffle of cards (a uniform distribution). They couldn't solve the puzzle reliably.
The Diagnosis: To figure out why, they ran a tiny, simple test. They found that one IQM machine (Garnet) was more reliable than the others, but none of them were quite ready for the big challenge yet.
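One simple way to put a number on "looks like a random shuffle" is to measure how far the observed results are from a perfectly even spread. The sketch below uses total variation distance for that. This metric is an assumption chosen for illustration (the paper may quantify it differently), and the measurement counts are invented, not taken from the paper.

```python
def distance_from_uniform(counts, num_outcomes):
    """Total variation distance between measured frequencies and a uniform spread.

    0.0 means indistinguishable from uniform; values near 1.0 mean the
    results pile up on very few outcomes.
    """
    shots = sum(counts.values())
    uniform = 1 / num_outcomes
    seen = sum(abs(count / shots - uniform) for count in counts.values())
    # Outcomes that never appeared each sit a full `uniform` away from even.
    unseen = (num_outcomes - len(counts)) * uniform
    return 0.5 * (seen + unseen)


# Invented example data for a 2-qubit readout (4 possible outcomes).
noisy_run = {"00": 26, "01": 24, "10": 25, "11": 25}
sharp_run = {"00": 4, "01": 3, "10": 3, "11": 90}

print(distance_from_uniform(noisy_run, 4))
print(distance_from_uniform(sharp_run, 4))
```

A run that scores close to zero is telling you almost nothing about the solution, while a high score only says the output is concentrated; you still have to check that it is concentrated on the correct answer.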

4. The Big Takeaways

The paper teaches us three main things about the current state of quantum computing:

Hardware is improving, but it's messy. We are making progress (IBM's newer chips are better), but we are still in the "Noisy" era. The machines are like toddlers learning to walk; they can take steps, but they stumble often.
The "Tuning" matters more than the "Brand." A newer, more expensive quantum computer isn't guaranteed to win. If it hasn't been calibrated (tuned) well that day, it might perform worse than an older, well-tuned one.
Size isn't everything. The "Möbius Ladder" puzzle was harder for the machines, not because it had more lights, but because the connections between the lights were more complex. It showed that how the lights are connected matters just as much as how many there are.

The Bottom Line

The authors successfully used the "Lights Out" game to test the limits of today's quantum computers. They found that while we are getting closer to solving real problems, we still have to deal with "noise" and "bad tuning."

It's like trying to bake a perfect cake in a kitchen where the oven temperature fluctuates wildly. You can get a good cake if you check the oven constantly and adjust your recipe, but you can't just trust the oven to do the work for you yet.

Where to see the results?
The authors put all their code, data, and "cake recipes" online so anyone can try to solve the puzzle on these machines themselves!
