Symbol-Equivariant Recurrent Reasoning Models

The paper introduces Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs), which enforce permutation equivariance at the architectural level to significantly improve the robustness, scalability, and generalization of neural reasoning on tasks like Sudoku and ARC-AGI compared to prior models.

Richard Freinschlag, Timo Bertram, Erich Kobler, Andreas Mayr, Günter Klambauer

Published 2026-03-03

Imagine you are trying to teach a robot how to solve a Sudoku puzzle.

The puzzle has a simple rule: every row, column, and 3x3 box must contain the numbers 1 through 9 exactly once. But here's the catch: the actual numbers don't matter. If you replaced every "1" with a "Red Apple," every "2" with a "Blue Banana," and every "3" with a "Green Grape," the logic of the puzzle would remain exactly the same. The robot should be able to solve it regardless of what symbols you use.

The Problem: The Robot is "Color-Blind" to Logic

Previous AI models, called Recurrent Reasoning Models (RRMs), were like students who memorized the specific answers to a practice test. If you gave them a test with the numbers 1–9, they did great. But if you gave them a test with the numbers 1–16 (a bigger, 16x16 Sudoku) or swapped the numbers for colors, they got confused.

To fix this, researchers used a "brute force" method: they showed the robot thousands of puzzles where the numbers were randomly swapped (data augmentation). It was like forcing the student to memorize every possible variation of the test. It worked, but it was slow, expensive, and the robot still couldn't handle puzzles it had never seen before.
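The "brute force" augmentation described above amounts to randomly relabelling the symbols of each training puzzle. As a rough sketch (not the paper's actual pipeline; the function name and encoding, with 0 marking an empty cell, are my own assumptions), it could look like this:

```python
import numpy as np

def permute_symbols(grid, rng):
    """Randomly relabel the symbols of a puzzle grid.

    Hypothetical augmentation sketch: 0 marks an empty cell and is
    left untouched; symbols 1..n are passed through a random
    bijection. The puzzle's logic is unchanged, only the labels move.
    """
    n = grid.max()
    perm = rng.permutation(n) + 1          # random bijection on {1..n}
    mapping = np.concatenate(([0], perm))  # keep 0 (empty) fixed
    return mapping[grid]

rng = np.random.default_rng(0)
grid = np.array([[1, 2, 0],
                 [0, 3, 1],
                 [2, 0, 3]])
augmented = permute_symbols(grid, rng)
# Empty cells stay empty, and cells that shared a symbol still do.
```

Training on thousands of such relabelled copies is what made the old approach slow and expensive: the model must memorize the variations rather than the underlying rule.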

The Solution: The "Universal Translator" (SE-RRM)

The authors of this paper introduced a new model called SE-RRM (Symbol-Equivariant Recurrent Reasoning Model).

Think of the old models as a chef who only knows how to cook with specific ingredients (e.g., "I only know how to use this specific brand of tomato"). If you give them a different tomato, they panic.

The new SE-RRM is like a chef who understands the concept of a tomato. They know that a tomato is a "red, round, acidic fruit" regardless of the brand.

  • The Magic Trick: Instead of memorizing that "Red = 1" and "Blue = 2," the SE-RRM is built with a special architectural rule: "If you swap the labels, the logic stays the same."
  • It treats the symbols (numbers, colors, shapes) as interchangeable tokens. It doesn't care if the puzzle uses numbers 1–9, colors, or emojis. It just cares about the relationships between them.
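The architectural rule "if you swap the labels, the logic stays the same" is called permutation equivariance, and it can be checked numerically. Here is a minimal toy layer in the DeepSets style (my own illustrative construction, not the paper's architecture): because every symbol channel is processed with the same weights plus a symmetric sum, relabelling the symbols of the input just relabels the output the same way.

```python
import numpy as np

def equivariant_layer(x, a=0.7, b=0.1):
    """A toy symbol-equivariant layer.

    x has shape (cells, symbols). Each symbol channel gets a weighted
    copy of itself plus a weighted sum over all channels; since no
    channel has its own weights, permuting channels commutes with
    the layer.
    """
    return a * x + b * x.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
x = rng.random((4, 9))        # 4 cells, 9 possible symbols
perm = rng.permutation(9)     # a random relabelling of the symbols

out_then_perm = equivariant_layer(x)[:, perm]   # apply layer, then relabel
perm_then_out = equivariant_layer(x[:, perm])   # relabel, then apply layer
# The two orders agree: the layer never "reads" the labels themselves.
```

A model built entirely from such symmetric operations gets the symbol-swap robustness for free, without ever seeing augmented data.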

How It Works: The "3D Puzzle"

In the old models, the AI looked at the puzzle as a flat 2D grid (rows and columns).
In the new SE-RRM, the AI adds a third dimension. Imagine the puzzle isn't just a flat sheet of paper, but a stack of transparent sheets.

  • One sheet per symbol: the sheet for symbol k lights up at exactly the positions where k appears.
  • Two of the axes encode position (rows and columns); the third axis encodes which symbol it is.
  • The AI looks at the whole stack in 3D, allowing it to see that "Position A" and "Position B" are related, and that "Symbol X" and "Symbol Y" are related, without getting confused by what the symbols actually are.
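The stack-of-sheets picture corresponds to a one-hot encoding: the flat 2D grid of symbol IDs becomes a 3D tensor with axes (row, column, symbol). A small sketch, assuming an encoding where 0 means an empty cell (the function name and details are illustrative, not from the paper):

```python
import numpy as np

def to_one_hot(grid, n_symbols):
    """Lift a 2D grid of symbol IDs into a 3D tensor with axes
    (row, column, symbol). A cell (r, c) holding symbol s becomes a
    1 at [r, c, s-1]; empty cells (0) stay all-zero."""
    rows, cols = grid.shape
    cube = np.zeros((rows, cols, n_symbols))
    r, c = np.nonzero(grid)                # filled cells only
    cube[r, c, grid[r, c] - 1] = 1.0       # light up the right sheet
    return cube

grid = np.array([[1, 0],
                 [2, 1]])
cube = to_one_hot(grid, n_symbols=2)
# Relabelling symbols is now just a permutation along the third axis,
# while positions live untouched on the first two axes.
```

In this representation, swapping two symbols means swapping two sheets, so an architecture that treats the symbol axis symmetrically is automatically indifferent to the labels.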

The Results: Why This Matters

The researchers tested this new model on three types of challenges:

  1. Sudoku (The Logic Test):

    • Old Models: Could solve standard 9x9 puzzles but failed miserably when asked to solve a 4x4 (smaller) or 16x16 (bigger) puzzle. They couldn't "extrapolate" (guess the rules for new sizes).
    • SE-RRM: Solved the 9x9 puzzles better than anyone else. More impressively, it successfully solved 4x4 puzzles it had never seen before and made decent guesses on 16x16 and 25x25 puzzles. It learned the rules, not just the answers.
  2. ARC-AGI (The "Human Intelligence" Test):

    • These are puzzles that test if a machine can think like a human (e.g., "If I move this shape, what happens?").
    • The Win: The SE-RRM achieved top-tier results using only 8 variations of the puzzle for training. The old models needed 1,000 variations to get similar results. It's the difference between learning a language by reading one book vs. reading a thousand different books.
  3. Mazes (The Planning Test):

    • Even when the "symbol swap" trick wasn't strictly necessary (a wall is never interchangeable with an exit), the new model still performed better, suggesting the architectural changes improve reasoning in general, not just symbol robustness.

The Big Picture

This paper is a breakthrough because it stops AI from being a "parrot" that just repeats what it memorized. Instead, it builds a "reasoner" that understands the underlying structure of a problem.

  • Efficiency: It needs way less data to learn.
  • Scalability: It can handle bigger, stranger problems without needing to be retrained.
  • Robustness: It doesn't break when you change the "colors" of the problem.

In short, the authors built a robot that doesn't just memorize the map; it understands the concept of navigation.
