GOMA: Geometrically Optimal Mapping via Analytical Modeling for Spatial Accelerators

This paper presents GOMA, a geometrically abstracted, globally optimal framework that uses analytical modeling to efficiently solve the combinatorial GEMM mapping problem for spatial accelerators, achieving significant improvements in energy-delay product and search speed over state-of-the-art methods.

Wulve Yang, Hailong Zou, Rui Zhou, Jionghao Zhang, Qiang Li, Gang Li, Yi Zhan, Shushan Qiao

Published Tue, 10 Ma

Imagine you are the manager of a massive, high-tech factory (a Spatial Accelerator) tasked with assembling millions of tiny Lego structures (performing Matrix Multiplications, the math behind AI).

Your factory has different storage rooms: a giant warehouse outside (DRAM), a large storage room inside (SRAM), workbenches (PE Arrays), and small toolboxes right next to the workers' hands (Registers).

The goal is to build the structures as fast and energy-efficiently as possible.

The Problem: The "Choice" Nightmare

To get the job done, you have to decide:

  1. How to pack the Lego bricks: Do you bring in a huge box of bricks at once, or small handfuls? (This is Tiling).
  2. The order of operations: Do you build the left side first, then the right? Or do you build the top, then the bottom? (This is Loop Permutation).
  3. What to keep at each storage spot: Do you stash a given kind of brick in the toolbox or storage room, or do you skip that spot and grab the bricks from the next room out every time you need one? (This is Bypass).
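
To make the first two choices concrete, here is a minimal Python sketch of a tiled matrix multiplication. The function name `tiled_gemm`, its parameters, and the single level of tiling are illustrative assumptions, not code from the paper. Bypass is not modeled, because this toy has only one storage level.

```python
from itertools import product

def tiled_gemm(A, B, M, N, K, tm, tn, tk, order=("m", "n", "k")):
    """Compute C = A @ B on nested lists, one (tm x tn x tk) tile at a time.

    `order` lists the tile loops from outermost to innermost (loop permutation);
    tm, tn, tk are the tile sizes (tiling).
    """
    C = [[0] * N for _ in range(M)]
    starts = {"m": range(0, M, tm), "n": range(0, N, tn), "k": range(0, K, tk)}
    # Visit tile origins in the requested loop order.
    for origin in product(*(starts[d] for d in order)):
        base = dict(zip(order, origin))
        m0, n0, k0 = base["m"], base["n"], base["k"]
        # Work inside one tile; min() handles tiles that overhang the edge.
        for m in range(m0, min(m0 + tm, M)):
            for n in range(n0, min(n0 + tn, N)):
                acc = 0
                for k in range(k0, min(k0 + tk, K)):
                    acc += A[m][k] * B[k][n]
                C[m][n] += acc
    return C

A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [0, 1], [2, 2]]
print(tiled_gemm(A, B, 2, 2, 3, tm=1, tn=2, tk=2, order=("k", "m", "n")))
```

Every tile size and loop order produces the same answer. What changes is how often each brick has to be carried between rooms, and that is the cost the mapping problem tries to minimize.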

The problem is that there are trillions of possible ways to combine these choices. It's like trying to find the perfect outfit by trying on every single combination of shirts, pants, and shoes in the world. If you try them all one by one, you'll be old before you finish. If you just sample a few and pick the best (which is what most current mapping tools do), you might get a decent outfit, but you'll miss the perfect one that saves the most energy.
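
A back-of-the-envelope count shows how quickly the choices multiply. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: four storage levels, one tile factor per level for each of the three matrix dimensions, any of the 3! loop orders at each level, and a keep-or-skip decision for each of the three matrices at two of the levels. The function names are hypothetical.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def ordered_factorizations(n, parts):
    """Ways to write n as an ordered product of `parts` positive integers."""
    if parts == 1:
        return 1
    return sum(ordered_factorizations(n // d, parts - 1)
               for d in range(1, n + 1) if n % d == 0)

def mapping_space_size(m, n, k, levels=4, bypassable_levels=2):
    """Rough size of the tiling x loop-order x bypass space (illustrative only)."""
    tilings = 1
    for dim in (m, n, k):
        tilings *= ordered_factorizations(dim, levels)  # one tile factor per level
    loop_orders = factorial(3) ** levels                # 3! orders of (m, n, k) per level
    bypass = 2 ** (3 * bypassable_levels)               # keep-or-skip per matrix per level
    return tilings * loop_orders * bypass

print(f"{mapping_space_size(1024, 1024, 1024):.2e}")
```

For a 1024 x 1024 x 1024 multiplication, this toy count already lands on the order of 10^12 combinations, and real accelerators add further choices on top.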

The Solution: GOMA (The "Geometric GPS")

The paper introduces GOMA, a new tool that acts like a GPS for your factory. Instead of guessing or trying every path, GOMA uses a clever trick called Geometric Abstraction.

1. The "Shadow" Analogy (Geometric Abstraction)

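Imagine your Lego structure is a 3D cube. GOMA doesn't look at the cube itself; it looks at the shadows the cube casts on the walls. (In matrix-multiplication terms, the cube is the full set of multiply-and-add steps, and the three shadows are the two input matrices and the output matrix.)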

  • One shadow shows the "Left/Right" view.
  • One shows the "Front/Back" view.
  • One shows the "Top/Bottom" view.

GOMA realizes that every time you move a step in the factory, you only change two of these shadows, while one stays the same.

  • If you move "Forward," the "Top" shadow doesn't change, so you don't need to fetch new "Top" bricks. You reuse what you have!
  • If you move "Sideways," the "Front" shadow stays the same, so you reuse those bricks.

By tracking these shadows, GOMA can instantly calculate exactly how many bricks need to be moved between storage rooms. It turns a complex, messy puzzle into a simple math equation that can be solved in a split second.
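
The shadow idea can be sketched in a few lines of Python for the standard matrix multiplication C[m][n] += A[m][k] * B[k][n]. The names `PROJECTIONS`, `footprint`, and `reused_when_stepping` are hypothetical helpers for this illustration, not identifiers from the paper.

```python
# Which two of the three dimensions index each matrix in C[m][n] += A[m][k] * B[k][n].
PROJECTIONS = {"A": ("m", "k"), "B": ("k", "n"), "C": ("m", "n")}

def footprint(tile):
    """Elements of each matrix touched by a tile given as {dimension: size}."""
    return {name: tile[d0] * tile[d1] for name, (d0, d1) in PROJECTIONS.items()}

def reused_when_stepping(dim):
    """Matrices whose shadow stays the same when the tile moves along `dim`."""
    return [name for name, dims in PROJECTIONS.items() if dim not in dims]

tile = {"m": 4, "n": 8, "k": 2}
print(footprint(tile))
for dim in "mnk":
    print(dim, "->", reused_when_stepping(dim))
```

Stepping along any one dimension leaves exactly one matrix untouched, which is the reuse the shadow picture describes.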

2. The "Shortcut" (Analytical Modeling)

Most other tools try to simulate the factory running to see how much energy it uses. This is like actually building the Lego set to see how long it takes. It's slow.

GOMA, however, has a magic formula. Because it understands the geometry of the shadows, it can predict the energy cost of any arrangement instantly (in O(1) time, which means it takes the same tiny amount of time whether you have 10 bricks or 10 billion).
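
The contrast between simulating and calculating can be illustrated with a toy model. It is an assumption of this sketch, not the paper's actual formula: the workbench holds one tile of each matrix, and a tile is fetched whenever the tile needed differs from the one already held. `fetches_simulated` walks every tile step, while `fetches_closed_form` gets the same count from the loop order and trip counts alone, so its cost does not grow with the problem size.

```python
from itertools import permutations, product
from math import prod

# Dimensions that index each matrix in C[m][n] += A[m][k] * B[k][n].
RELEVANT = {"A": {"m", "k"}, "B": {"k", "n"}, "C": {"m", "n"}}

def fetches_simulated(operand, order, trips):
    """Walk every tile step and count how often the operand's tile changes."""
    count, previous = 0, None
    for step in product(*(range(trips[d]) for d in order)):
        tile_id = tuple(i for d, i in zip(order, step) if d in RELEVANT[operand])
        if tile_id != previous:
            count, previous = count + 1, tile_id
    return count

def fetches_closed_form(operand, order, trips):
    """Same count without walking the steps.

    Find the innermost loop that both indexes the operand and runs more than
    once; every iteration of that loop and of the loops outside it costs a fetch.
    """
    last = -1
    for pos, d in enumerate(order):
        if d in RELEVANT[operand] and trips[d] > 1:
            last = pos
    return prod(trips[d] for d in order[: last + 1])

trips = {"m": 4, "n": 3, "k": 2}  # number of tiles along each dimension
for order in permutations("mnk"):
    for operand in "ABC":
        assert fetches_simulated(operand, order, trips) == fetches_closed_form(operand, order, trips)
print("closed form matches simulation for all six loop orders")
```

In this toy, the simulated count takes time proportional to the number of tile steps, while the closed form only inspects three loops.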

3. The "Perfect Solution" (Global Optimality)

Because GOMA has this instant formula, it can use a powerful math solver to look at all the possibilities at once and find the single best arrangement.

  • Old tools: "I'll try 1,000 random outfits and pick the best one I found." (Might miss the perfect one).
  • GOMA: "I have a map of the entire world. I can prove mathematically that this specific outfit is the absolute best possible one."
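
The difference between sampling and covering the whole space can be shown on a tiny example. The sketch below is only a stand-in: it enumerates every candidate by brute force, whereas the paper describes handing its formula to a math solver, and the cost function is a simplified one-buffer traffic model assumed for illustration. All names are hypothetical.

```python
import random
from itertools import permutations, product
from math import prod

# Dimensions that index each matrix in C[m][n] += A[m][k] * B[k][n].
RELEVANT = {"A": ("m", "k"), "B": ("k", "n"), "C": ("m", "n")}

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def traffic(shape, tile, order):
    """Elements fetched from outer memory when one tile per matrix is buffered (toy model)."""
    trips = {d: shape[d] // tile[d] for d in "mnk"}
    total = 0
    for dims in RELEVANT.values():
        # Innermost loop that indexes this matrix and runs more than once.
        last = max((p for p, d in enumerate(order) if d in dims and trips[d] > 1), default=-1)
        fetches = prod(trips[d] for d in order[: last + 1])
        total += fetches * tile[dims[0]] * tile[dims[1]]
    return total

def candidates(shape, capacity):
    """Every (tile, loop order) pair whose three tiles fit in the buffer together."""
    for tm, tn, tk in product(*(divisors(shape[d]) for d in "mnk")):
        if tm * tk + tk * tn + tm * tn <= capacity:
            for order in permutations("mnk"):
                yield {"m": tm, "n": tn, "k": tk}, order

def best_exhaustive(shape, capacity):
    """Check every candidate, so the result is the true minimum of this toy model."""
    return min(candidates(shape, capacity), key=lambda c: traffic(shape, *c))

def best_random(shape, capacity, samples, seed=0):
    """Check only a random handful of candidates."""
    pool = list(candidates(shape, capacity))
    picks = random.Random(seed).sample(pool, min(samples, len(pool)))
    return min(picks, key=lambda c: traffic(shape, *c))

shape = {"m": 16, "n": 16, "k": 16}
print("exhaustive:", traffic(shape, *best_exhaustive(shape, 64)))
print("10 samples:", traffic(shape, *best_random(shape, 64, samples=10)))
```

The exhaustive result can never be worse than the sampled one, but enumeration only works because this toy space has a few hundred entries. The point of a fast formula plus a solver is to get the same guarantee without visiting every candidate.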

Why Does This Matter?

The researchers tested GOMA on modern AI models (like the ones powering chatbots and image generators).

  • Efficiency: GOMA found ways to run these models that score 2 to 4 times better than the current best tools on energy-delay product, a measure that combines how much energy a run uses with how long it takes. That means your phone battery lasts longer, or your data center uses less electricity.
  • Speed: GOMA figured out the best plan 4 to 70 times faster than the competition. It didn't waste time guessing; it just calculated the answer.

The Takeaway

Think of GOMA as the difference between a tourist wandering around a city hoping to find the best restaurant, and a local who has a perfect map and knows exactly which restaurant is the best, how to get there, and how much it costs, without ever having to walk the whole city.

It takes a chaotic, overwhelming problem (choosing how to run AI on chips) and turns it into a clean, solvable geometry puzzle whose answer is provably the most efficient one its model can find.