Active Inference with a Self-Prior in the Mirror-Mark Task

This paper presents a computational model, based on active inference and a Transformer-implemented self-prior, in which a simulated infant spontaneously passes the mirror-mark test: it detects and removes a novel mark on its own face without external rewards, offering a unified free-energy-principle explanation for the developmental origins of self-awareness.

Dongmin Kim, Hoshinori Kanazawa, Yasuo Kuniyoshi

Published 2026-04-14

The Big Question: How Do We Know "That's Me"?

Imagine you are looking in a mirror. You see a reflection. Suddenly, you notice a smudge of red paint on the reflection's forehead. You reach out and touch your own forehead to wipe it off.

This seems obvious to us, but for a computer or a robot, it's a massive puzzle. How does the machine know that the "face" in the mirror is actually its own face? How does it know that the red smudge is "wrong" and needs to be fixed, without anyone telling it to do so?

For decades, scientists have used the "Mirror Test" (putting a mark on an animal's face and seeing whether it touches the mark) to check for self-awareness. Humans usually pass this test around age 2. Chimpanzees pass it too. But how do we build a machine that learns this on its own?

The Solution: A "Mental Memory" of Normalcy

The researchers built a simulated baby robot and taught it a simple trick: Don't look for the "mark." Look for the "weirdness."

They didn't program the robot with a rule like "If you see a red dot, touch your face." Instead, they gave the robot a Self-Prior.

The "Comfort Zone" Analogy

Think of the Self-Prior as a mental "Comfort Zone" or a familiar playlist.

  • Imagine you have a playlist of songs you've listened to every day for years. You know exactly how they sound.
  • One day, someone sneaks a loud, screeching noise into the middle of your favorite song.
  • You don't need a teacher to tell you, "That noise is bad." Your brain immediately screams, "That doesn't fit! That's not my song!"

In this study, the robot spent thousands of hours watching itself in the mirror and moving its arms. It built a "playlist" of what its own body looks and feels like when everything is normal (no stickers, no paint). This is its Self-Prior.
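The idea of a learned "playlist of normalcy" can be pictured with a minimal sketch. The paper implements the Self-Prior with a Transformer; here a diagonal Gaussian over made-up toy observations stands in for it, just to show how "surprise" falls out of learned statistics:

```python
import numpy as np

# Toy sketch of a "self-prior": a diagonal Gaussian fit to normal
# self-observations. The paper uses a Transformer; the Gaussian, the
# data, and the 16-dim observations here are all invented stand-ins.

rng = np.random.default_rng(0)

# Thousands of "normal" self-views (e.g. flattened mirror images).
normal_obs = rng.normal(loc=0.5, scale=0.05, size=(5000, 16))

# Learning the self-prior = estimating the statistics of normalcy.
mu = normal_obs.mean(axis=0)
sigma = normal_obs.std(axis=0) + 1e-6

def surprise(obs):
    """Negative log-likelihood under the self-prior (up to a constant)."""
    z = (obs - mu) / sigma
    return 0.5 * float(np.sum(z ** 2 + np.log(2 * np.pi * sigma ** 2)))

familiar = rng.normal(0.5, 0.05, size=16)  # an ordinary self-view
marked = familiar.copy()
marked[3] = 1.0                            # a "sticker": one feature far off-pattern

print(surprise(marked) > surprise(familiar))  # → True
```

No sticker detector exists anywhere in this sketch; the mark is flagged only because it is statistically unlike everything the model has seen of "itself."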

The Experiment: The Sticker Surprise

Once the robot had built its "Comfort Zone," the researchers stuck a sticker on its face.

  1. The Glitch: When the robot looked in the mirror, it saw a sticker.
  2. The Alarm: The robot's internal "playlist" said, "Wait a minute. I know what my face looks like. This sticker is a weird noise in my song. It doesn't fit my familiar pattern."
  3. The Action: The robot didn't have a specific goal to "remove the sticker." Its only goal was to make the world feel familiar again (to get back to the "Comfort Zone").
  4. The Result: Because the sticker was on its own face, the only way to make the "song" sound normal again was to reach out and wipe the sticker off.

How It Works (The Magic Sauce)

The researchers used a concept called Active Inference. Here is the simple version:

  • The Goal: The robot wants to minimize "Expected Free Energy."
  • Translation: The robot wants to reduce expected surprise, i.e., how badly future observations would clash with what it already considers familiar.
  • The Process:
    • The robot sees the sticker. Surprise level: High!
    • The robot imagines different actions: "What if I move my hand left? What if I move right?"
    • It simulates the future. If it moves its hand to the sticker, the sticker disappears, and the image in the mirror matches its "Comfort Zone" (Self-Prior). Surprise level: Low!
    • So, it chooses that action.

It's like a thermostat. The thermostat doesn't "want" to be cold; it just wants to match the temperature you set. If the room gets too hot, the AC turns on to bring it back to the set point. The robot's "set point" is familiarity with itself.
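The thermostat analogy itself reduces to a few lines of negative feedback; the numbers are made up for illustration:

```python
# Toy thermostat: it acts only to cancel deviation from a set point, just
# as the robot acts to cancel deviation from its self-prior.
set_point = 21.0   # the "familiar" temperature
room_temp = 27.0   # a surprisingly hot room

for _ in range(20):
    error = room_temp - set_point   # "surprise" about the temperature
    room_temp -= 0.5 * error        # cooling proportional to the mismatch

print(round(room_temp, 2))  # → 21.0
```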

The Results

The simulated baby robot succeeded in finding and removing the sticker about 70% of the time.

Crucially, the robot did this without:

  • Being told to remove the sticker.
  • Having a specific "sticker detector" programmed in.
  • Feeling the sticker with its fingers (it used only vision and proprioception, the sense of where its joints are).

The robot realized that the sticker was "non-self" simply because it didn't match the "self" it had learned to expect.

Why This Matters

This study suggests that self-awareness might not be a magical, complex thing. It might just be a simple statistical process:

  1. Learn what "you" usually look and feel like.
  2. Notice when something doesn't fit that pattern.
  3. Act to fix the mismatch.

The "Self-Prior" acts like a probabilistic body map. It's not a rigid drawing of a body; it's a fuzzy, statistical guess of what your body usually is. When the mirror shows something that breaks that guess, the robot knows, "That's not me, or that's something wrong with me," and it acts to fix it.

The Bottom Line

This paper shows that you don't need a complex brain or a teacher to understand "self." You just need a system that learns its own habits and gets annoyed when things don't fit. By trying to make the world feel "normal" again, the robot accidentally discovered that it has a body, and that the reflection in the mirror is its own.

It's a beautiful example of how curiosity and the desire for comfort can lead to the birth of self-awareness.
