Active Inference with a Self-Prior in the Mirror-Mark Task

This paper presents a computational model, based on active inference and a Transformer-implemented self-prior, in which a simulated infant spontaneously passes the mirror-mark test: it detects and removes a novel mark on its own face without external rewards, offering a unified free-energy-principle explanation for the developmental origins of self-awareness.

Dongmin Kim, Hoshinori Kanazawa, Yasuo Kuniyoshi

Published 2026-04-14

The Big Question: How Do We Know "That's Me"?

Imagine you are looking in a mirror. You see a reflection. Suddenly, you notice a smudge of red paint on the reflection's forehead. You reach out and touch your own forehead to wipe it off.

This seems obvious to us, but for a computer or a robot, it's a massive puzzle. How does the machine know that the "face" in the mirror is actually its own face? How does it know that the red smudge is "wrong" and needs to be fixed, without anyone telling it to do so?

For decades, scientists have used the "Mirror Test" (putting a mark on an animal's face and seeing whether it touches the mark) to check for self-awareness. Humans usually pass this test around age 2. Chimpanzees pass it too. But how do we build a machine that learns this on its own?

The Solution: A "Mental Memory" of Normalcy

The researchers built a simulated baby robot and taught it a simple trick: Don't look for the "mark." Look for the "weirdness."

They didn't program the robot with a rule like "If you see a red dot, touch your face." Instead, they gave the robot a Self-Prior.

The "Comfort Zone" Analogy

Think of the Self-Prior as a mental "Comfort Zone" or a familiar playlist.

  • Imagine you have a playlist of songs you've listened to every day for years. You know exactly how they sound.
  • One day, someone sneaks a loud, screeching noise into the middle of your favorite song.
  • You don't need a teacher to tell you, "That noise is bad." Your brain immediately screams, "That doesn't fit! That's not my song!"

In this study, the robot spent thousands of hours watching itself in the mirror and moving its arms. It built a "playlist" of what its own body looks and feels like when everything is normal (no stickers, no paint). This is its Self-Prior.
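The idea of a learned "playlist of normalcy" can be pictured with a minimal sketch. The paper implements the Self-Prior with a Transformer; here a diagonal Gaussian over made-up toy observations stands in for it, just to show how "surprise" falls out of learned statistics:

```python
import numpy as np

# Toy sketch of a "self-prior": a diagonal Gaussian fit to normal
# self-observations. The paper uses a Transformer; the Gaussian, the
# data, and the 16-dim observations here are all invented stand-ins.

rng = np.random.default_rng(0)

# Thousands of "normal" self-views (e.g. flattened mirror images).
normal_obs = rng.normal(loc=0.5, scale=0.05, size=(5000, 16))

# Learning the self-prior = estimating the statistics of normalcy.
mu = normal_obs.mean(axis=0)
sigma = normal_obs.std(axis=0) + 1e-6

def surprise(obs):
    """Negative log-likelihood under the self-prior (up to a constant)."""
    z = (obs - mu) / sigma
    return 0.5 * float(np.sum(z ** 2 + np.log(2 * np.pi * sigma ** 2)))

familiar = rng.normal(0.5, 0.05, size=16)  # an ordinary self-view
marked = familiar.copy()
marked[3] = 1.0                            # a "sticker": one feature far off-pattern

print(surprise(marked) > surprise(familiar))  # → True
```

No sticker detector exists anywhere in this sketch; the mark is flagged only because it is statistically unlike everything the model has seen of "itself."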

The Experiment: The Sticker Surprise

Once the robot had built its "Comfort Zone," the researchers stuck a sticker on its face.

  1. The Glitch: When the robot looked in the mirror, it saw a sticker.
  2. The Alarm: The robot's internal "playlist" said, "Wait a minute. I know what my face looks like. This sticker is a weird noise in my song. It doesn't fit my familiar pattern."
  3. The Action: The robot didn't have a specific goal to "remove the sticker." Its only goal was to make the world feel familiar again (to get back to the "Comfort Zone").
  4. The Result: Because the sticker was on its own face, the only way to make the "song" sound normal again was to reach out and wipe the sticker off.

How It Works (The Magic Sauce)

The researchers used a concept called Active Inference. Here is the simple version:

  • The Goal: The robot wants to minimize "Expected Free Energy."
  • Translation: The robot wants to reduce expected surprise, i.e., how badly future observations would clash with what it already considers familiar.
  • The Process:
    • The robot sees the sticker. Surprise level: High!
    • The robot imagines different actions: "What if I move my hand left? What if I move right?"
    • It simulates the future. If it moves its hand to the sticker, the sticker disappears, and the image in the mirror matches its "Comfort Zone" (Self-Prior). Surprise level: Low!
    • So, it chooses that action.

It's like a thermostat. The thermostat doesn't "want" to be cold; it just wants to match the temperature you set. If the room gets too hot, the AC turns on to bring it back to the set point. The robot's "set point" is familiarity with itself.
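The thermostat analogy itself reduces to a few lines of negative feedback; the numbers are made up for illustration:

```python
# Toy thermostat: it acts only to cancel deviation from a set point, just
# as the robot acts to cancel deviation from its self-prior.
set_point = 21.0   # the "familiar" temperature
room_temp = 27.0   # a surprisingly hot room

for _ in range(20):
    error = room_temp - set_point   # "surprise" about the temperature
    room_temp -= 0.5 * error        # cooling proportional to the mismatch

print(round(room_temp, 2))  # → 21.0
```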

The Results

The simulated baby robot succeeded in finding and removing the sticker about 70% of the time.

Crucially, the robot did this without:

  • Being told to remove the sticker.
  • Having a specific "sticker detector" programmed in.
  • Feeling the sticker with its fingers (it used only vision and proprioception, the sense of where its joints are).

The robot realized that the sticker was "non-self" simply because it didn't match the "self" it had learned to expect.

Why This Matters

This study suggests that self-awareness might not be a magical, complex thing. It might just be a simple statistical process:

  1. Learn what "you" usually look and feel like.
  2. Notice when something doesn't fit that pattern.
  3. Act to fix the mismatch.

The "Self-Prior" acts like a probabilistic body map. It's not a rigid drawing of a body; it's a fuzzy, statistical guess of what your body usually is. When the mirror shows something that breaks that guess, the robot knows, "That's not me, or that's something wrong with me," and it acts to fix it.

The Bottom Line

This paper shows that you don't need a complex brain or a teacher to understand "self." You just need a system that learns its own habits and gets annoyed when things don't fit. By trying to make the world feel "normal" again, the robot accidentally discovered that it has a body, and that the reflection in the mirror is its own.

It's a beautiful example of how curiosity and the desire for comfort can lead to the birth of self-awareness.
