Energy Efficient Exact and Approximate Systolic Array Architecture for Matrix Multiplication

This paper proposes an energy-efficient 8x8 systolic array architecture built from novel exact and approximate processing elements (PPC and NPPC). The designs achieve power savings of 22% and 32%, respectively, while maintaining high output quality for error-resilient image and vision processing applications such as DCT and edge detection.

Pragun Jaswal, L. Hemanth Krishna, B. Srinivasu

Published 2026-03-24

Imagine you are running a massive, high-speed factory that builds complex 3D puzzles. This factory is the brain of modern Artificial Intelligence (AI). Every day, it has to perform billions of tiny calculations called "multiplications" to figure out how to recognize a cat in a photo, translate a sentence, or drive a car.

This paper is about redesigning the workers inside this factory to make them faster, cheaper, and far more energy-efficient, without ruining the final puzzle.

Here is the breakdown using simple analogies:

1. The Problem: The Exhausted Factory

In current AI computers (like the ones in your phone or Google's servers), the "workers" (called Processing Elements or PEs) are perfectionists. They calculate every single number with 100% exact precision.

  • The Issue: Being a perfectionist takes a lot of energy and space. It's like hiring a team of accountants who count every single grain of sand on a beach to measure the beach's size. It's accurate, but it's exhausting and slow.
  • The Result: These factories (chips) get hot, drain batteries quickly, and are too big to fit in small devices like smartwatches or drones.
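The "factory" described above is a systolic array: a grid of workers (PEs) where each one repeatedly multiplies two incoming numbers and adds the result to a running total. A minimal functional sketch of that behavior (a plain-Python simulation for illustration, not the paper's hardware design):

```python
# An N x N output-stationary systolic array, simulated functionally.
# Each PE (i, j) holds one accumulator; on step k it multiplies the
# value arriving from the left (A[i][k]) by the value arriving from the
# top (B[k][j]) and adds it to its accumulator: one multiply-accumulate.

def systolic_matmul(A, B):
    n = len(A)
    acc = [[0] * n for _ in range(n)]  # one accumulator per PE
    # In real hardware the inputs stream in with a one-cycle skew;
    # functionally, every PE performs n MAC operations.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                acc[i][j] += A[i][k] * B[k][j]  # the PE's MAC
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))  # [[19, 22], [43, 50]]
```

Every one of those multiply-accumulate steps is where the "perfectionist" energy goes, which is exactly what the paper targets.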

2. The Solution: The "Good Enough" Workers

The authors of this paper proposed a new design for these workers. They introduced two types of workers:

  • The Exact Worker: Still perfect, but built with a smarter, more efficient blueprint.
  • The Approximate Worker: This is the star of the show. This worker is willing to make tiny, almost invisible mistakes to save huge amounts of energy.

The Analogy:
Imagine you are painting a picture of a sunset.

  • The Exact Worker measures the color of every single pixel to ensure the orange is exactly #FFA500.
  • The Approximate Worker looks at the orange and says, "That's close enough to sunset orange," and moves on.
  • The Catch: To the human eye, the painting looks identical. But the Approximate Worker used 68% less energy to finish the job!
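One common way to build a "good enough" multiplier is to simply ignore the least significant bits of the inputs, since they barely affect the result. This toy example illustrates that general idea (it is not the paper's actual PPC/NPPC circuit):

```python
# Exact vs. approximate multiplication: dropping the lowest bits of
# each operand removes hardware but introduces only a tiny relative
# error. A generic illustration of the accuracy/energy trade-off.

def exact_mul(a, b):
    return a * b

def approx_mul(a, b, drop_bits=2):
    mask = ~((1 << drop_bits) - 1)  # zero out the lowest bits
    return (a & mask) * (b & mask)

a, b = 181, 205
exact = exact_mul(a, b)    # 37105
approx = approx_mul(a, b)  # 180 * 204 = 36720
err = abs(exact - approx) / exact
print(f"exact={exact} approx={approx} relative error={err:.2%}")
```

The answer is off by about 1%, yet the hardware that would have computed those low bits simply does not exist, so it burns no power at all.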

3. The Secret Sauce: "Partial Product Cells"

How did they make these workers so efficient? They redesigned the tools the workers use.

  • Old Tools: The old tools were heavy and clunky. They had to do a lot of extra steps to handle negative numbers (like debts in a bank account).
  • New Tools (PPC & NPPC): The authors invented new, lightweight tools.
    • PPC (Positive Tool): Handles the "good" numbers efficiently.
    • NPPC (Negative Tool): Handles the "debt" numbers efficiently using a clever trick (like using a NAND gate, which is a simple logic switch).
  • The Result: These new tools are smaller, faster, and use less electricity. It's like swapping a heavy steam engine for a sleek electric motor.
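In bit-level terms, an ordinary partial product is just an AND of two bits. For signed (two's-complement) numbers, the partial products touching the sign bit have negative weight, and the classic Baugh-Wooley trick replaces their AND with a NAND plus a fixed correction constant, so no separate negation hardware is needed. Below is a generic sketch of that trick (illustrating the NAND idea the article mentions, not the paper's exact PPC/NPPC netlists):

```python
# Signed n-bit multiplication built from AND/NAND partial products,
# Baugh-Wooley style: cells with exactly one sign bit use NAND instead
# of AND, and two constant correction terms are added at the end.

def nand(x, y):
    return 1 - (x & y)

def baugh_wooley_mul(a, b, n=4):
    abits = [(a >> i) & 1 for i in range(n)]
    bbits = [(b >> i) & 1 for i in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            sign_cell = (i == n - 1) != (j == n - 1)  # exactly one sign bit
            pp = nand(abits[i], bbits[j]) if sign_cell else abits[i] & bbits[j]
            total += pp << (i + j)
    total += (1 << n) + (1 << (2 * n - 1))  # Baugh-Wooley correction
    return total & ((1 << (2 * n)) - 1)     # wrap to 2n-bit result

def to_signed(x, bits):
    return x - (1 << bits) if x >= (1 << (bits - 1)) else x

# -3 * 5 in 4-bit two's complement:
print(to_signed(baugh_wooley_mul(0b1101, 0b0101), 8))  # -15
```

The payoff is that every cell is a single cheap gate (AND or NAND), which is the spirit of the lightweight "positive" and "negative" tools described above.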

4. The Proof: Does the Picture Still Look Good?

You might ask, "If they make mistakes, won't the AI fail?"
The authors tested this in three real-world scenarios:

  1. Compressing Photos (DCT): Imagine shrinking a photo to fit in an email.
    • Result: The photo looked almost perfect (45.97 dB quality). The "mistakes" were so small you couldn't see them.
  2. Finding Edges (Kernel Method): Imagine a robot trying to draw the outline of a cup.
    • Result: With a moderate amount of "sloppiness," the outline was still very clear (30.45 dB).
  3. Smart Edge Detection (CNN): This is the big test. Imagine a self-driving car trying to see the edge of a road.
    • Result: Amazingly, the system was super accurate (75.98 dB). Why? Because the AI network is smart enough to ignore the tiny errors made by the workers and still figure out the road perfectly.
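Those "dB" scores are PSNR (peak signal-to-noise ratio): a standard measure of how closely the approximate output matches the exact one, where higher is better and differences above roughly 30 dB are hard to see. A minimal sketch of the standard formula (not the paper's evaluation code):

```python
# PSNR compares an approximate image against the exact reference:
# PSNR = 10 * log10(peak^2 / MSE), with peak = 255 for 8-bit pixels.
import math

def psnr(exact, approx, peak=255):
    mse = sum((e - a) ** 2 for e, a in zip(exact, approx)) / len(exact)
    if mse == 0:
        return float("inf")  # images are identical
    return 10 * math.log10(peak ** 2 / mse)

exact  = [52, 120, 200, 33, 90, 250, 17, 180]
approx = [51, 121, 198, 33, 91, 249, 18, 179]
print(f"{psnr(exact, approx):.2f} dB")  # 47.16 dB
```

Here the pixels are off by only a point or two, and the score lands in the mid-40s, much like the DCT result above: numerically imperfect, visually identical.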

5. Why This Matters

This paper is a game-changer for the future of technology because:

  • Battery Life: Your phone or drone could last much longer because the computer isn't wasting energy on "perfect" math that no one notices.
  • Size: These chips can be smaller, meaning we can put powerful AI into tiny devices (like hearing aids or smart glasses).
  • Speed: With less heat and faster arithmetic, AI can run more smoothly.

The Bottom Line

The authors didn't just make the workers faster; they made them smarter about when to be perfect and when to be "good enough." They proved that in the world of AI, you don't need to be perfect to be brilliant. You just need to be efficient.

In short: They built a super-efficient factory that saves massive amounts of energy while still producing pictures and decisions that look and feel exactly the same to us humans.
