Reinforcement Learning Control of Quantum Error Correction

This paper introduces a reinforcement learning framework that unifies quantum error correction with continuous system calibration. The authors experimentally demonstrate a 3.5-fold improvement in logical stability on a superconducting processor, report record-low logical error rates, and use simulations to argue that the approach scales to future fault-tolerant quantum computers.

Volodymyr Sivak, Alexis Morvan, Michael Broughton, Rodrigo G. Cortiñas, Johannes Bausch, Andrew W. Senior, Matthew Neeley, Alec Eickbusch, Noah Shutty, Laleh Aghababaie Beni, James S. Spencer, Francisco J. H Heras, Thomas Edlich, Dmitry Abanin, Amira Abbas, Rajeev Acharya, Georg Aigeldinger, Ross Alcaraz, Sayra Alcaraz, Trond I. Andersen, Markus Ansmann, Frank Arute, Kunal Arya, Walt Askew, Nikita Astrakhantsev, Juan Atalaya, Brian Ballard, Joseph C. Bardin, Hector Bates, Andreas Bengtsson, Majid Bigdeli Karimi, Alexander Bilmes, Simon Bilodeau, Felix Borjans, Alexandre Bourassa, Jenna Bovaird, Dylan Bowers, Leon Brill, Peter Brooks, David A. Browne, Brett Buchea, Bob B. Buckley, Tim Burger, Brian Burkett, Nicholas Bushnell, Jamal Busnaina, Anthony Cabrera, Juan Campero, Hung-Shen Chang, Silas Chen, Ben Chiaro, Liang-Ying Chih, Agnetta Y. Cleland, Bryan Cochrane, Matt Cockrell, Josh Cogan, Roberto Collins, Paul Conner, Harold Cook, William Courtney, Alexander L. Crook, Ben Curtin, Martin Damyanov, Sayan Das, Dripto M. Debroy, Sean Demura, Paul Donohoe, Ilya Drozdov, Andrew Dunsworth, Valerie Ehimhen, Aviv Moshe Elbag, Lior Ella, Mahmoud Elzouka, David Enriquez, Catherine Erickson, Vinicius S. Ferreira, Marcos Flores, Leslie Flores Burgos, Ebrahim Forati, Jeremiah Ford, Austin G. Fowler, Brooks Foxen, Masaya Fukami, Alan Wing Lun Fung, Lenny Fuste, Suhas Ganjam, Gonzalo Garcia, Christopher Garrick, Robert Gasca, Helge Gehring, Robert Geiger, Élie Genois, William Giang, Dar Gilboa, James E. Goeders, Edward C. Gonzales, Raja Gosula, Stijn J. de Graaf, Alejandro Grajales Dau, Dietrich Graumann, Joel Grebel, Alex Greene, Jonathan A. Gross, Jose Guerrero, Loïck Le Guevel, Tan Ha, Steve Habegger, Tanner Hadick, Ali Hadjikhani, Michael C. Hamilton, Matthew P. Harrigan, Sean D. Harrington, Jeanne Hartshorn, Stephen Heslin, Paula Heu, Oscar Higgott, Reno Hiltermann, Hsin-Yuan Huang, Mike Hucka, Christopher Hudspeth, Ashley Huff, William J. 
Huggins, Evan Jeffrey, Shaun Jevons, Zhang Jiang, Xiaoxuan Jin, Chaitali Joshi, Pavol Juhas, Andreas Kabel, Dvir Kafri, Hui Kang, Kiseo Kang, Amir H. Karamlou, Ryan Kaufman, Kostyantyn Kechedzhi, Tanuj Khattar, Mostafa Khezri, Seon Kim, Can M. Knaut, Bryce Kobrin, Fedor Kostritsa, John Mark Kreikebaum, Ryuho Kudo, Ben Kueffler, Arun Kumar, Vladislav D. Kurilovich, Vitali Kutsko, Nathan Lacroix, David Landhuis, Tiano Lange-Dei, Brandon W. Langley, Pavel Laptev, Kim-Ming Lau, Justin Ledford, Joy Lee, Kenny Lee, Brian J. Lester, Wendy Leung, Lily Li, Wing Yan Li, Ming Li, Alexander T. Lill, William P. Livingston, Matthew T. Lloyd, Aditya Locharla, Laura De Lorenzo, Daniel Lundahl, Aaron Lunt, Sid Madhuk, Aniket Maiti, Ashley Maloney, Salvatore Mandrà, Leigh S. Martin, Orion Martin, Eric Mascot, Paul Masih Das, Dmitri Maslov, Melvin Mathews, Cameron Maxfield, Jarrod R. McClean, Matt McEwen, Seneca Meeks, Kevin C. Miao, Zlatko K. Minev, Reza Molavi, Sebastian Molina, Shirin Montazeri, Charles Neill, Michael Newman, Anthony Nguyen, Murray Nguyen, Chia-Hung Ni, Murphy Yuezhen Niu, Logan Oas, Raymond Orosco, Kristoffer Ottosson, Alice Pagano, Agustin Di Paolo, Sherman Peek, David Peterson, Alex Pizzuto, Elias Portoles, Rebecca Potter, Orion Pritchard, Michael Qian, Chris Quintana, Arpit Ranadive, Matthew J. Reagor, Rachel Resnick, David M. Rhodes, Daniel Riley, Gabrielle Roberts, Roberto Rodriguez, Emma Ropes, Lucia B. De Rose, Eliott Rosenberg, Emma Rosenfeld, Dario Rosenstock, Elizabeth Rossi, Pedram Roushan, David A. Rower, Robert Salazar, Kannan Sankaragomathi, Murat Can Sarihan, Kevin J. Satzinger, Max Schaefer, Sebastian Schroeder, Henry F. Schurkus, Aria Shahingohar, Michael J. Shearn, Aaron Shorter, Vladimir Shvarts, Spencer Small, W. Clarke Smith, David A. Sobel, Barrett Spells, Sofia Springer, George Sterling, Jordan Suchard, Aaron Szasz, Alexander Sztein, Madeline Taylor, Jothi Priyanka Thiruraman, Douglas Thor, Dogan Timucin, Eifu Tomita, Alfredo Torres, M. 
Mert Torunbalci, Hao Tran, Abeer Vaishnav, Justin Vargas, Sergey Vdovichev, Guifre Vidal, Catherine Vollgraff Heidweiller, Meghan Voorhees, Steven Waltman, Jonathan Waltz, Shannon X. Wang, Brayden Ware, James D. Watson, Yonghua Wei, Travis Weidel, Theodore White, Kristi Wong, Bryan W. K. Woo, Christopher J. Wood, Maddy Woodson, Cheng Xing, Z. Jamie Yao, Ping Yeh, Bicheng Ying, Juhwan Yoo, Noureldin Yosri, Elliot Young, Grayson Young, Adam Zalcman, Ran Zhang, Yaxing Zhang, Ningfeng Zhu, Nicholas Zobrist, Zhenjie Zou, Ryan Babbush, Dave Bacon, Sergio Boixo, Yu Chen, Zijun Chen, Michel Devoret, Monica Hansen, Jeremy Hilton, Cody Jones, Julian Kelly, Alexander N. Korotkov, Erik Lucero, Anthony Megrant, Hartmut Neven, William D. Oliver, Ganesh Ramachandran, Vadim Smelyanskiy, Paul V. Klimov

Published Tue, 10 Ma

Imagine you are trying to keep a house of cards standing in the middle of a windy room. The cards represent your quantum computer's data, and the wind represents the constant, tiny jitters and errors caused by the environment (temperature changes, electrical noise, etc.).

In the past, if the cards started to wobble, the only way to fix them was to stop everything. You would freeze the room, carefully re-stack the cards, check every single one, and then start again. But for the complex calculations of the future (which might take days or weeks), stopping every time the wind blows is impossible. You'd never finish the job.

This paper from Google Quantum AI and Google DeepMind introduces a new way to handle this: teaching the quantum computer to "surf" the wind instead of fighting it.

Here is the breakdown of how they did it, using simple analogies:

1. The Problem: The "Drifting" Tuning Fork

Quantum computers are incredibly sensitive analog machines. Think of them like a giant, ultra-precise orchestra. To play a perfect song (a calculation), every instrument (qubit) must be perfectly tuned.

  • The Old Way: Every hour, the conductor stops the music, checks every instrument, re-tunes them, and then starts over. This is slow and wasteful.
  • The New Problem: The instruments don't stay in tune once they are set; they drift. As the temperature changes, the tuning shifts while the music is playing, and stopping to fix it breaks the flow.

2. The Solution: The "Self-Correcting" Conductor

The researchers created an Artificial Intelligence (AI) agent using a technique called Reinforcement Learning (RL).

  • The Metaphor: Imagine a conductor who doesn't just listen to the music, but also feels the wind in the room. Instead of stopping the orchestra, the conductor makes tiny, invisible adjustments to the instruments while they are playing.
  • How it learns: The AI doesn't need to know the physics of the wind. It just watches the "mistakes." In quantum computing, when an error happens, it leaves a tiny "footprint" (called a detection event). The AI treats these footprints as a score.
    • Fewer footprints? Good job! (Reward).
    • More footprints? Try adjusting the knobs differently. (Penalty).
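
To make the "footprints as a score" idea concrete, here is a minimal sketch in Python. It assumes the standard definition of a detection event (a stabilizer measurement outcome that differs from the same stabilizer's outcome in the previous round) and scores a run by the negative fraction of detection events. The function names and the exact reward formula are illustrative assumptions, not the paper's implementation.

```python
def detection_events(syndrome_rounds):
    """XOR each stabilizer outcome with its value in the previous round.

    syndrome_rounds: list of rounds, each a list of 0/1 stabilizer outcomes.
    A 1 in the result marks a detection event (the outcome flipped).
    """
    events = []
    for prev, curr in zip(syndrome_rounds, syndrome_rounds[1:]):
        events.append([a ^ b for a, b in zip(prev, curr)])
    return events


def reward(syndrome_rounds):
    """Hypothetical reward: negative fraction of detection events.

    Fewer footprints means a reward closer to zero (better).
    """
    events = detection_events(syndrome_rounds)
    total = sum(len(r) for r in events)
    fired = sum(sum(r) for r in events)
    return -fired / total if total else 0.0


quiet = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]  # outcomes never change
noisy = [[0, 1, 0], [1, 1, 0], [1, 0, 0]]  # two outcomes flip between rounds
print(reward(quiet), reward(noisy))
```

The quiet run scores higher than the noisy one, which is all the agent needs in order to tell a good knob setting from a bad one.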

3. The Magic Trick: Turning Errors into a Map

Usually, errors are bad. But this system turns errors into a GPS map.

  • The AI looks at where the errors are happening and asks, "Which control knob caused this?"
  • It then nudges that knob slightly in the opposite direction to fix it.
  • Because the AI is constantly doing this, it creates a feedback loop: the computer learns from its own mistakes in real time, without ever stopping.
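
The loop described above can be sketched with a toy model. This summary does not specify the paper's actual RL algorithm, so the example swaps in a simple perturb-and-compare update: wiggle the knobs in a random direction, see which side produces fewer detection events, and step that way. The quadratic `detection_rate` model, the knob values, and the step sizes are all assumptions chosen for illustration.

```python
import random


def miscalibration(knobs, optimum):
    """Squared distance of the knobs from their (hidden) best setting."""
    return sum((k - o) ** 2 for k, o in zip(knobs, optimum))


def detection_rate(knobs, optimum, rng, noise=0.002):
    """Toy model: detection events rise as the knobs leave the optimum."""
    return 0.05 + miscalibration(knobs, optimum) + rng.gauss(0.0, noise)


def nudge(knobs, optimum, rng, step=0.1, probe=0.05):
    """One feedback step: probe a random direction both ways,
    then move toward the side with fewer detection events."""
    direction = [rng.choice((-1.0, 1.0)) for _ in knobs]
    plus = [k + probe * d for k, d in zip(knobs, direction)]
    minus = [k - probe * d for k, d in zip(knobs, direction)]
    slope = (detection_rate(plus, optimum, rng)
             - detection_rate(minus, optimum, rng)) / (2 * probe)
    return [k - step * slope * d for k, d in zip(knobs, direction)]


rng = random.Random(0)
optimum = [0.0, 0.0, 0.0, 0.0]   # unknown to the agent; used only by the toy model
knobs = [0.3, -0.2, 0.1, 0.4]    # hypothetical starting calibration
start = miscalibration(knobs, optimum)
for _ in range(200):
    knobs = nudge(knobs, optimum, rng)
end = miscalibration(knobs, optimum)
print(f"miscalibration before: {start:.4f}, after: {end:.6f}")
```

Note that `nudge` never reads the optimum directly. It only sees noisy detection rates, which mirrors the point that the agent does not need a physics model of the noise.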

4. The Results: A 3.5x Boost

They tested this on a superconducting processor called "Willow."

  • The Experiment: They intentionally made the system drift (like turning up the wind) to see if the AI could handle it.
  • The Outcome: The AI-controlled system was 3.5 times more stable than the old method. It kept the house of cards standing even when the wind got stronger.
  • Record Breaking: They report the lowest logical error rates recorded so far for these types of quantum codes, suggesting that a learning agent can fine-tune a quantum computer beyond what conventional calibration achieves.
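
Why does continuous tuning help under drift? A one-knob toy simulation gives the intuition. It is a sketch under made-up assumptions (linear drift, a quadratic error model, arbitrary step sizes) and does not reproduce the paper's experiment or its 3.5x figure. It compares a knob that is calibrated once and then frozen against a knob that keeps following the drift.

```python
import random


def excess_events(knob, optimum):
    """Toy model: extra detection events grow quadratically with miscalibration."""
    return (knob - optimum) ** 2


def run(track, steps=500, drift=0.002, step=0.1, probe=0.05, noise=0.002, seed=1):
    """Return the time-averaged excess detection rate under injected drift."""
    rng = random.Random(seed)
    knob, optimum, total = 0.0, 0.0, 0.0
    for _ in range(steps):
        optimum += drift  # injected drift moves the sweet spot every step
        if track:
            # Probe both sides of the current setting, step toward fewer events.
            up = excess_events(knob + probe, optimum) + rng.gauss(0.0, noise)
            down = excess_events(knob - probe, optimum) + rng.gauss(0.0, noise)
            knob -= step * (up - down) / (2 * probe)
        total += excess_events(knob, optimum)
    return total / steps


frozen = run(track=False)   # calibrated once, then left alone
tracked = run(track=True)   # keeps adjusting while "running"
print(f"frozen: {frozen:.4f}  tracked: {tracked:.6f}")
```

In this toy setting the frozen knob falls further behind with every step, while the tracked knob settles at a small, steady lag behind the moving optimum.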

5. Why This Matters: The "Never-Stop" Computer

The most exciting part is scalability.

  • The Old Fear: As quantum computers get bigger (with thousands of qubits), the number of knobs to turn grows into the millions. Humans can't tune millions of knobs by hand, and stopping to tune them would take forever.
  • The New Hope: The AI's approach is not tied to the size of the computer. It learns the pattern of the errors and adjusts the knobs automatically. Simulations in the paper indicate that the method works about as well for a massive computer as it does for a small one.

The Bottom Line

This paper is a major step toward Fault-Tolerant Quantum Computing. It moves us from a world where we have to pause and fix our computers constantly, to a world where the computer is self-healing.

It's like upgrading from a car that needs a mechanic to stop and adjust the engine every 10 miles, to a car with a self-driving AI that constantly adjusts the suspension, fuel, and steering while you drive at 100 mph, ensuring you never crash, no matter how bumpy the road gets.

In short: They taught the quantum computer to learn from its own mistakes and fix itself on the fly, paving the way for machines that can run complex calculations for days without ever stopping.