Comparative Performance Analysis of NIST PQC Standards: From STM32 Software Limitations to FPGA-SoC Acceleration

This paper shows that NIST-standardized post-quantum signature schemes such as SPHINCS+ and Dilithium are impractical in software on resource-constrained ARM Cortex-M4 microcontrollers because of severe performance and memory limitations. A hardware-software co-design that uses an FPGA-accelerated NTT core on a Zynq-7000 SoC, by contrast, achieves efficient, millisecond-level execution suitable for quantum-resistant embedded systems.

Original authors: Mustafa Akif Yıldırım, Osman Tokluoglu

Published 2026-06-16

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine the world of digital security as a giant vault. For decades, the locks on this vault (like RSA and ECC) have been incredibly strong, but a new kind of thief is emerging: the Quantum Computer. A large enough quantum computer would hold a master key that opens these old locks far faster than any ordinary computer could. To stop this thief, NIST (the US standards body) has selected and standardized new, super-complex locks designed by cryptographers, known as Post-Quantum Cryptography (PQC).

This paper is a report card on trying to install these new, heavy-duty locks on two very different types of "doorframes": a small, budget-friendly microcontroller (like the brain inside a smart thermostat) and a more capable chip that pairs a processor with reconfigurable hardware (like the brain inside an advanced drone).

Here is the breakdown of their experiment using simple analogies:

1. The Two New Locks

The researchers tested two specific types of new locks:

  • Dilithium (The Math Puzzle): This lock is based on complex lattice math. It's like trying to solve a massive, multi-dimensional jigsaw puzzle where the pieces are huge polynomials. It requires a lot of workspace (memory) to hold all the pieces while you solve it.
  • SPHINCS+ (The Hash Tree): This lock is based on hashing (scrambling data). It's like building a massive tree of hashes with small one-time signatures at its leaves. To sign a message, you have to climb up and down this tree thousands of times, doing a lot of heavy lifting (hashing) at every step.
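
To make the "Hash Tree" picture concrete, here is a minimal Python sketch of a single Merkle tree, the kind of building block that hash-based schemes such as SPHINCS+ combine into much larger structures. It is a toy, not SPHINCS+ itself: the helper names (H, build_tree, auth_path, root_from_path), the 16-byte SHAKE-256 output, and the eight-leaf tree are all assumptions chosen for illustration. The counter shows how hash calls pile up even for a tiny tree.

```python
import hashlib

stats = {"hash_calls": 0}  # counts every hash invocation


def H(data: bytes) -> bytes:
    """Hash helper (assumption: SHAKE-256 truncated to 16 bytes)."""
    stats["hash_calls"] += 1
    return hashlib.shake_256(data).digest(16)


def build_tree(leaves):
    """Return every level of the tree, from hashed leaves up to the root."""
    levels = [[H(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def auth_path(levels, index):
    """Collect the sibling node at each level: the 'climb' from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling sits next to the current node
        index >>= 1
    return path


def root_from_path(leaf, index, path):
    """Recompute the root from one leaf and its sibling path."""
    node = H(leaf)
    for sibling in path:
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index >>= 1
    return node


if __name__ == "__main__":
    leaves = [bytes([i]) for i in range(8)]  # toy data, power-of-two count
    levels = build_tree(leaves)
    root = levels[-1][0]
    path = auth_path(levels, 5)
    print("hash calls to build the tree:", stats["hash_calls"])
    print("path verifies:", root_from_path(leaves[5], 5, path) == root)
```

Building a tree with n leaves costs 2n - 1 hash calls in this sketch. A real hash-based signature works with far larger trees, and many of them, which is where the heavy lifting comes from.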

2. The First Attempt: The "Tiny Workshop" (STM32 Microcontroller)

The researchers first tried to install these locks on an STM32, a standard, low-cost microcontroller built around an ARM Cortex-M4 core. Think of this chip as a tiny, one-room workshop with a very small workbench (192 KB of memory) and a single, slow worker.

  • The Dilithium Failure: When they tried to bring the "Math Puzzle" into this tiny workshop, the puzzle pieces were simply too big. The worker tried to lay them out on the workbench, but the bench was too small, the pieces spilled over the edge, and the whole system crashed. In technical terms, the chip ran out of stack memory (a stack overflow) immediately.
  • The SPHINCS+ Failure: The "Hash Tree" didn't crash the workshop, but it was agonizingly slow. Because the worker had to climb the tree thousands of times without any help, it took about 10 minutes just to sign a single message. By the time they tried to verify the signature, the system gave up entirely. It was too slow to be useful in real life.
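
The Dilithium memory problem above can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses the public Dilithium parameters (256 coefficients per polynomial, and a k-by-l matrix of polynomials at each security level). Everything else is an assumption for illustration: that coefficients are stored as 32-bit integers, that a straightforward signer keeps the whole matrix plus about eight vectors in memory at once, and the function names themselves. The source does not say which parameter set was tested or how the stack was configured.

```python
N = 256                  # coefficients per polynomial (Dilithium)
BYTES_PER_COEFF = 4      # assumption: coefficients held as 32-bit integers
MCU_MEMORY = 192 * 1024  # the 192 KB quoted above

# (k, l) matrix dimensions for the three standardized security levels
PARAMS = {"Dilithium2": (4, 4), "Dilithium3": (6, 5), "Dilithium5": (8, 7)}


def poly_bytes() -> int:
    """Size of one polynomial in bytes."""
    return N * BYTES_PER_COEFF


def working_set_bytes(k: int, l: int) -> int:
    """Rough signing working set: the k-by-l matrix plus a handful of vectors.

    Assumption: three length-l vectors, five length-k vectors and one extra
    polynomial are live at the same time, as in an unoptimized signer.
    """
    polys = k * l + 3 * l + 5 * k + 1
    return polys * poly_bytes()


for name, (k, l) in PARAMS.items():
    size = working_set_bytes(k, l)
    share = 100 * size / MCU_MEMORY
    print(f"{name}: about {size // 1024} KB, {share:.0f}% of 192 KB")
```

Under these assumptions the working set comes to roughly 49 KB, 76 KB, and 118 KB for the three levels. Those buffers have to share the chip's memory with the rest of the firmware, and the stack normally gets only a slice of the total, so it is easy to see how the workbench overflows.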

The Lesson: Trying to run these new, heavy quantum-proof locks on a standard, small microcontroller is like trying to build a skyscraper in a garden shed. It just doesn't have the space or the speed.

3. The Second Attempt: The "Super-Factory" (FPGA-SoC)

Realizing the tiny workshop couldn't handle the job, the researchers moved to a Zynq-7000 SoC. Think of this as a massive, high-tech factory that has two distinct parts working together:

  • The Manager (Processor System): A standard computer brain that handles the paperwork, organizes the messages, and tells the workers what to do.
  • The Specialized Robots (FPGA Fabric): A custom-built area where they can build specialized machines specifically designed for the job.

The Solution: Hardware-Software Co-Design
Instead of asking the Manager to do the heavy lifting, they built custom robots (accelerators) inside the factory to do the hard math:

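  • They built a Robot specifically for the "Math Puzzle": a Number Theoretic Transform (NTT) core, which speeds up the polynomial multiplications at the heart of Dilithium.
  • They built another Robot specifically for the "Hash Tree": a core for Keccak (the hash function behind SHA-3 and SHAKE) that scrambles data at lightning speed.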
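
To show what the "Math Puzzle" robot actually computes, here is a small software sketch of NTT-based polynomial multiplication, checked against the slow schoolbook method. It is not the paper's hardware design, only an illustration of the same math in Python. The constants Q, N, and ZETA are the public Dilithium parameters; the function names and the simple recursive structure are assumptions made for readability.

```python
import random

Q = 8380417  # Dilithium modulus
N = 256      # polynomial length; arithmetic is modulo x^N + 1
ZETA = 1753  # primitive 512th root of unity modulo Q


def ntt_cyclic(a, omega):
    """Recursive radix-2 transform: out[i] = sum_j a[j] * omega^(i*j) mod Q."""
    n = len(a)
    if n == 1:
        return a[:]
    sq = omega * omega % Q
    even = ntt_cyclic(a[0::2], sq)
    odd = ntt_cyclic(a[1::2], sq)
    out = [0] * n
    w = 1
    for i in range(n // 2):
        t = w * odd[i] % Q
        out[i] = (even[i] + t) % Q           # butterfly, upper half
        out[i + n // 2] = (even[i] - t) % Q  # butterfly, lower half
        w = w * omega % Q
    return out


def multiply_ntt(a, b):
    """Multiply two polynomials modulo (x^N + 1, Q) in O(N log N) steps."""
    omega = ZETA * ZETA % Q            # primitive N-th root of unity
    zeta_inv = pow(ZETA, Q - 2, Q)     # modular inverses via Fermat
    omega_inv = pow(omega, Q - 2, Q)
    n_inv = pow(N, Q - 2, Q)
    # Scaling by ZETA^j turns the x^N = -1 wrap-around into a plain cyclic one.
    a_t = [a[j] * pow(ZETA, j, Q) % Q for j in range(N)]
    b_t = [b[j] * pow(ZETA, j, Q) % Q for j in range(N)]
    fa, fb = ntt_cyclic(a_t, omega), ntt_cyclic(b_t, omega)
    fc = [x * y % Q for x, y in zip(fa, fb)]  # pointwise product
    c_t = ntt_cyclic(fc, omega_inv)           # inverse transform, unscaled
    return [c_t[j] * n_inv % Q * pow(zeta_inv, j, Q) % Q for j in range(N)]


def multiply_schoolbook(a, b):
    """Reference O(N^2) multiplication modulo (x^N + 1, Q)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q  # x^N = -1
    return c


if __name__ == "__main__":
    rng = random.Random(0)
    a = [rng.randrange(Q) for _ in range(N)]
    b = [rng.randrange(Q) for _ in range(N)]
    print("NTT matches schoolbook:", multiply_ntt(a, b) == multiply_schoolbook(a, b))
```

The inner "butterfly" loop is the part that maps naturally onto hardware: each pair of outputs depends on only two inputs and one multiplication, so many butterflies can run side by side in FPGA fabric instead of one after another on a processor.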

The Result:

  • The Manager just handed the data to the Robots.
  • The Robots did the heavy lifting in parallel (all at once).
  • The results came back in milliseconds instead of minutes.
    • Key Generation: ~1 millisecond.
    • Signing: ~6 milliseconds.

The Bottom Line

The paper concludes that while the "Tiny Workshop" (standard microcontrollers) is great for simple tasks, it is completely unprepared for the heavy math required by future quantum-proof security.

To make these new locks work in the real world, you can't just rely on software; you need Hardware-Software Co-Design. You need a system where a standard computer brain manages the process, but specialized hardware robots (FPGAs) do the heavy lifting. Without these specialized robots, the new locks are too slow or too big to use on everyday devices.
