Erase at the Core: Representation Unlearning for Machine Unlearning

The paper introduces Erase at the Core (EC), a model-agnostic framework for comprehensive machine unlearning. EC applies multi-layer contrastive learning and deep supervision to eliminate superficial forgetting, substantially reducing representational similarity across the entire network hierarchy while preserving performance on retained data.

Jaewon Lee, Yongwoo Kim, Donghyun Kim

Published 2026-03-02

The Problem: The "Superficial Amnesia"

Imagine you hire a chef (the AI model) to cook a massive banquet using recipes from 1,000 different cultures. One day, a customer says, "I want to forget about the Italian recipes. Please remove all knowledge of pasta, pizza, and lasagna from your mind."

Most current methods for "unlearning" amount to superficial amnesia.

  • What they do: They tell the chef, "If someone asks for pasta, just say 'I don't know' or give them a random soup."
  • The Result: The chef looks like they forgot. If you ask them directly, they fail the test.
  • The Catch: If you peek inside the chef's brain (the internal features), they still have the Italian recipes written on sticky notes in every drawer. They haven't actually deleted the knowledge; they just learned to hide it. If you give them a new prompt or a slightly different question, they can easily pull those Italian recipes back out.

The authors call this "Superficial Forgetting." The model passes the test, but the information is still lurking in the background, waiting to be recovered.
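One way to make "peeking inside the brain" concrete is to compare the model's internal features on the forget data before and after unlearning. The sketch below uses linear CKA, a common representation-similarity measure. Whether the paper uses this exact metric is an assumption here, and the feature matrices are synthetic stand-ins, not real activations.

```python
import numpy as np

def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two feature matrices of shape (n_samples, dim)."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature dimension
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, ord="fro") ** 2
    self_x = np.linalg.norm(x.T @ x, ord="fro")
    self_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (self_x * self_y))

rng = np.random.default_rng(0)

# Hypothetical forget-set features from one layer of the original model.
original = rng.normal(size=(200, 16))

# Superficial forgetting: the outputs changed, but the features barely moved.
superficial = original + 0.01 * rng.normal(size=(200, 16))

# Thorough forgetting (idealized): features unrelated to the originals.
erased = rng.normal(size=(200, 16))

cka_superficial = linear_cka(original, superficial)  # close to 1
cka_erased = linear_cka(original, erased)            # much lower

print(f"superficial: {cka_superficial:.3f}, erased: {cka_erased:.3f}")
```

A score near 1 at an intermediate layer means the "sticky notes" are still there, whatever the final output says.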

The Solution: "Erase at the Core" (EC)

The authors propose a new method called Erase at the Core (EC). Instead of just telling the chef to hide the answer, they want to burn the recipes from the inside out.

Here is how EC works, using a few analogies:

1. The Multi-Layer Scrubbing (The "Deep Clean")

Think of the AI model as a multi-story building.

  • The Old Way: Most unlearning methods only clean the lobby (the final output layer). They wipe the sign that says "Italian Food" off the door. But the kitchens on the 2nd, 3rd, and 4th floors are still full of Italian ingredients.
  • The EC Way: EC sends a cleaning crew to every single floor of the building. They go to the basement, the middle floors, and the top floor. They scrub the walls, wash the floors, and throw out the ingredients at every level. This ensures that the "Italian" concept is erased from the foundation up to the roof.
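The "every floor" idea can be sketched as collecting features at each layer and summing a per-layer loss over all of them, not just the last. This is a minimal illustration, not the authors' implementation: the tiny ReLU network, `forward_with_features`, `multi_layer_loss`, and the placeholder loss are all hypothetical names chosen for this example.

```python
import numpy as np
from typing import Callable, List, Optional

def forward_with_features(x: np.ndarray, weights: List[np.ndarray]) -> List[np.ndarray]:
    """Run a toy MLP and return the activations of every layer (every "floor")."""
    features = []
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)  # linear layer followed by ReLU
        features.append(h)
    return features

def multi_layer_loss(
    features: List[np.ndarray],
    layer_loss: Callable[[np.ndarray], float],
    layer_weights: Optional[List[float]] = None,
) -> float:
    """Weighted sum of a per-layer loss over all layers."""
    if layer_weights is None:
        layer_weights = [1.0] * len(features)
    return float(sum(w * layer_loss(f) for w, f in zip(layer_weights, features)))

rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 16)), rng.normal(size=(16, 4))]
forget_batch = rng.normal(size=(32, 8))
features = forward_with_features(forget_batch, weights)

# Placeholder per-layer objective; a real unlearning loss would go here.
placeholder_loss = lambda f: float(np.mean(f ** 2))

final_only = placeholder_loss(features[-1])               # "cleaning the lobby"
all_layers = multi_layer_loss(features, placeholder_loss)  # "cleaning every floor"
```

The only structural difference from output-only unlearning is that the objective sees `features[0]` through `features[-1]`, so gradients reach the early layers directly.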

2. The "Confusion" Strategy (Contrastive Unlearning)

How do they actually erase the memory?

  • Imagine the "Italian" recipes are stored in a specific, neat row of filing cabinets.
  • EC takes those files and smashes them up, then scatters the pieces into the cabinets containing "French" and "Mexican" recipes.
  • It mixes the "forget" data so thoroughly with the "keep" data that the AI can no longer tell where the Italian recipes end and the others begin. The distinct shape of the Italian knowledge is dissolved.
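The scattering described above can be written as a contrastive objective in which each forget sample is pulled toward retain samples and pushed away from the other forget samples. The following is one plausible InfoNCE-style formulation under that assumption; it is not claimed to be the paper's exact loss, and the function names and temperature value are illustrative.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def contrastive_unlearning_loss(forget: np.ndarray, retain: np.ndarray,
                                temperature: float = 0.5) -> float:
    """Low when forget features sit among retain features, high when they cluster together."""
    f = l2_normalize(forget)
    r = l2_normalize(retain)
    sim_to_retain = np.exp(f @ r.T / temperature)  # attraction terms, shape (n_forget, n_retain)
    sim_to_forget = np.exp(f @ f.T / temperature)  # repulsion terms, shape (n_forget, n_forget)
    np.fill_diagonal(sim_to_forget, 0.0)           # ignore each sample's similarity to itself
    pos = sim_to_retain.sum(axis=1)
    neg = sim_to_forget.sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))

rng = np.random.default_rng(0)
retain = rng.normal(size=(64, 8))  # stand-in for "French" and "Mexican" features

# Before unlearning: the forget class forms its own tight cluster (a neat row of cabinets).
center = np.zeros(8)
center[0] = 5.0
clustered_forget = center + 0.1 * rng.normal(size=(16, 8))

# After scattering: forget features are spread out like everything else.
scattered_forget = rng.normal(size=(16, 8))

loss_clustered = contrastive_unlearning_loss(clustered_forget, retain)
loss_scattered = contrastive_unlearning_loss(scattered_forget, retain)
```

Minimizing this quantity rewards exactly the mixing the analogy describes: the tight forget cluster has the higher loss, the scattered one the lower.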

3. The "Guardian" (Deep Supervision)

You might worry: "If I mix everything up, won't the chef forget how to cook anything?"

  • EC has a safety net. While it is smashing the Italian files, it has a Guardian watching the "French" and "Mexican" recipes.
  • The Guardian ensures that while the Italian files are being destroyed, the French and Mexican files remain perfectly organized and easy to find. This ensures the chef stays good at cooking the dishes they are supposed to keep.
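Deep supervision is commonly implemented by attaching a small auxiliary classifier to each intermediate layer and adding its classification loss on the retained data to the training objective. The sketch below assumes linear heads and a plain sum of cross-entropy terms; the head design and any per-layer weighting in the paper may differ, and all names here are illustrative.

```python
import numpy as np
from typing import List

def softmax_cross_entropy(logits: np.ndarray, labels: np.ndarray) -> float:
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def deep_supervision_loss(features: List[np.ndarray], heads: List[np.ndarray],
                          labels: np.ndarray) -> float:
    """Sum of retain-set classification losses, one auxiliary linear head per layer."""
    return float(sum(softmax_cross_entropy(f @ h, labels) for f, h in zip(features, heads)))

rng = np.random.default_rng(0)
n_classes = 3
layer_dims = [16, 16, 8]

# Hypothetical retain-set features at three layers, with their class labels.
retain_labels = rng.integers(0, n_classes, size=32)
retain_features = [rng.normal(size=(32, d)) for d in layer_dims]

# With all-zero heads every prediction is uniform, so each layer contributes log(n_classes).
untrained_heads = [np.zeros((d, n_classes)) for d in layer_dims]
baseline = deep_supervision_loss(retain_features, untrained_heads, retain_labels)
```

During unlearning, this term would be minimized alongside the forgetting objective so that every layer, not only the last, stays useful for the classes being kept.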

Why This Matters

The paper shows that previous methods were like putting a blindfold on the chef. The chef couldn't say the Italian words, but they could still think them.

Erase at the Core removes the blindfold and actually deletes the thoughts.

  • Better Privacy: It makes it much harder for an attacker to recover the "forbidden" data from the model's internal features, for example by training a simple linear classifier on them (a "linear probing attack").
  • True Compliance: By targeting the internal representations and not just the outputs, it comes closer to the intent of "Right to be Forgotten" laws (like GDPR) than methods that merely hide the data.
  • Plug-and-Play: EC isn't a whole new kitchen; it's a plug-in module. You can attach it to an existing unlearning method to make that method better at actually deleting information.
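A linear probing attack of the kind mentioned in the privacy point can be sketched in a few lines: freeze the model, take its internal features, and train a simple linear classifier to detect the forget class. The example below uses synthetic features and a hand-written logistic regression; the feature shift, step count, and learning rate are illustrative assumptions, not values from the paper.

```python
import numpy as np

def add_bias(features: np.ndarray) -> np.ndarray:
    return np.hstack([features, np.ones((len(features), 1))])

def fit_linear_probe(features: np.ndarray, labels: np.ndarray,
                     steps: int = 500, lr: float = 0.1) -> np.ndarray:
    """Logistic regression trained by gradient descent on frozen features."""
    x = add_bias(features)
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(x @ w)))
        w -= lr * x.T @ (probs - labels) / len(labels)
    return w

def probe_accuracy(w: np.ndarray, features: np.ndarray, labels: np.ndarray) -> float:
    predictions = (add_bias(features) @ w > 0).astype(int)
    return float(np.mean(predictions == labels))

def make_features(rng, labels: np.ndarray, shift: float) -> np.ndarray:
    """Synthetic features; `shift` controls how much the forget class stands out."""
    return rng.normal(size=(len(labels), 8)) + shift * labels[:, None]

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 200)  # 1 = forget class, 0 = everything else

# Superficially unlearned model: the forget class is still linearly separable inside.
leaky_probe = fit_linear_probe(make_features(rng, labels, 2.0), labels)
leaky_acc = probe_accuracy(leaky_probe, make_features(rng, labels, 2.0), labels)

# Thoroughly unlearned model (idealized): forget features look like all the others.
erased_probe = fit_linear_probe(make_features(rng, labels, 0.0), labels)
erased_acc = probe_accuracy(erased_probe, make_features(rng, labels, 0.0), labels)

print(f"probe accuracy, leaky: {leaky_acc:.2f}, erased: {erased_acc:.2f}")
```

If the probe does much better than chance on held-out features, the information is still in the representation, even when the model's own outputs look like it has forgotten.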

The Bottom Line

If you want to truly forget something, you can't just stop talking about it. You have to rewire your brain so the memory doesn't exist anymore. Erase at the Core is the tool that does exactly that for AI, scrubbing the memory clean from the bottom up, ensuring that once data is deleted, it's really, truly gone.
