Missing-by-Design: Certifiable Modality Deletion for Revocable Multimodal Sentiment Analysis

The paper introduces Missing-by-Design (MBD), a unified framework for revocable multimodal sentiment analysis that combines structured representation learning with a certifiable parameter-modification pipeline to enable the machine-verifiable deletion of specific data modalities while maintaining predictive performance and privacy compliance.

Rong Fu, Ziming Wang, Chunlei Meng, Jiaxuan Lu, Jiekai Wu, Kangan Qian, Hao Zhang, Simon Fong

Published Wed, 11 Ma

Imagine you have a very smart assistant who helps you analyze how people feel based on three signals: what they write (text), how they sound (audio), and what their face looks like (video). This assistant is great at guessing emotions, but it has a problem: it memorizes everything, including private details you might not want it to keep forever.

What if you wanted to say, "Hey, please forget everything you learned from my voice, but keep the rest of your knowledge about my text and face"? Usually, to do this, you'd have to fire the assistant and train a replacement from scratch on everything except your voice. That's expensive, slow, and wasteful.

This paper introduces a new system called Missing-by-Design (MBD). Think of it as a "surgical eraser" that can remove specific memories from your AI assistant without firing the whole team.

Here is how it works, broken down into simple concepts:

1. The "Universal Translator" vs. The "Specialist"

Most AI models mix all their information together in a big blender. If you want to remove the "audio" flavor, you accidentally ruin the "text" flavor too.

MBD's Secret Sauce: It separates the information into two buckets:

  • The "Sample-Specific" Bucket: This holds the unique details of one particular sample (e.g., this speaker saying "I am angry right now").
  • The "Property" Bucket: This holds the general rules of how that type of data works (e.g., "How human voices generally express sadness").

By keeping these separate, the AI can say, "I know how to understand voices in general (Property), but I will forget the specific voice patterns from this user."
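
To make the two-bucket idea concrete, here is a minimal sketch. It is not the paper's learned encoder: it assumes a plain mean/residual split as a stand-in, and the names `split_buckets`, `property_vec`, and `specific` are hypothetical.

```python
from statistics import fmean

def split_buckets(embeddings):
    """Split embeddings into a shared 'property' vector and per-sample residuals.

    Assumption: the per-dimension mean stands in for the "Property" bucket,
    and each sample's deviation from it stands in for the "Sample-Specific" bucket.
    """
    dim = len(embeddings[0])
    property_vec = [fmean(e[i] for e in embeddings) for i in range(dim)]
    specific = [[e[i] - property_vec[i] for i in range(dim)] for e in embeddings]
    return property_vec, specific

# Toy audio embeddings for three samples (made-up numbers).
audio_embeddings = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]
property_vec, specific = split_buckets(audio_embeddings)

# "Forgetting" sample 0 drops only its residual; the shared vector survives.
remaining_specific = specific[1:]
```

The point of the split is that a deletion request can target the per-sample part while leaving the general knowledge about the modality untouched.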

2. The "Magic Reconstructor"

What happens if the AI forgets the audio? Won't it be confused?
MBD has a built-in Magic Reconstructor. Imagine a painter who has seen thousands of faces. If you cover up the eyes in a photo, the painter can guess what the eyes probably looked like based on the rest of the face.

  • If the audio is missing, the AI uses the text and video to "hallucinate" (reconstruct) what the audio would have sounded like.
  • This ensures the AI stays smart and accurate even when parts of the data are missing or deleted.
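
As a rough illustration of filling in a missing modality from the other two, the sketch below uses nearest-neighbour lookup. This is an assumption made for brevity, not the paper's reconstructor, and `reconstruct_audio` and the `memory` layout are hypothetical.

```python
import math

def reconstruct_audio(text, video, memory):
    """Guess a missing audio vector from the text and video vectors.

    Assumption: we return the audio vector of the stored sample whose
    concatenated text+video features are closest to the query.
    """
    query = text + video
    best = min(memory, key=lambda s: math.dist(query, s["text"] + s["video"]))
    return best["audio"]

# Toy reference samples with all three modalities present (made-up numbers).
memory = [
    {"text": [1.0, 0.0], "video": [0.0, 1.0], "audio": [0.2, 0.8]},
    {"text": [0.0, 1.0], "video": [1.0, 0.0], "audio": [0.9, 0.1]},
]

# A new sample arrives without audio; borrow a plausible audio vector.
guessed_audio = reconstruct_audio([0.9, 0.1], [0.1, 0.9], memory)
```

A learned reconstructor would generalize rather than copy, but the data flow is the same: the available modalities go in, and an estimate of the missing one comes out.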

3. The "Surgical Scalpel" (The Deletion)

When you ask to delete the audio, MBD doesn't just smash the audio part of the brain. It performs surgery:

  1. Identify the Culprits: It looks at the neural network and finds the specific "wires" (parameters) that are most responsible for remembering audio.
  2. The Cut: It carefully cuts or dampens those specific wires.
  3. The Noise Injection: To make sure the AI truly "forgets" and doesn't just pretend to, it adds a tiny, controlled amount of static (noise) to those wires. This is like scrambling a specific page in a book so the story is gone, but the rest of the book remains readable.
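
The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the saliency scores are supplied by hand rather than computed from a model, and `delete_modality`, `damping`, and `sigma` are hypothetical names, not the paper's notation.

```python
import random

def delete_modality(params, saliency, top_k, damping=0.0, sigma=0.01, seed=0):
    """Dampen the most audio-salient parameters and add Gaussian noise to them."""
    rng = random.Random(seed)
    # Step 1: rank parameters by saliency magnitude and keep the top_k indices.
    ranked = sorted(range(len(params)), key=lambda i: abs(saliency[i]), reverse=True)
    selected = sorted(ranked[:top_k])
    updated = list(params)
    for i in selected:
        # Step 2: shrink the selected weight (damping=0.0 cuts it entirely).
        # Step 3: add zero-mean Gaussian noise with standard deviation sigma.
        updated[i] = damping * params[i] + rng.gauss(0.0, sigma)
    return updated, selected

# Toy weights and hand-picked saliency scores for the audio modality.
params = [0.5, -1.2, 0.3, 0.8]
saliency = [0.1, 0.9, 0.05, 0.7]
updated, selected = delete_modality(params, saliency, top_k=2)
```

Only the selected entries change; every other parameter is copied through untouched, which is what keeps the text and video behaviour intact in this toy version.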

4. The "Certificate of Deletion" (The Receipt)

This is the coolest part. In the real world, if you ask a company to delete your data, how do you know they actually did it? They could just say "Okay" and keep it.

MBD generates a Modality Deletion Certificate (MDC).

  • Think of this as a digital receipt or a notarized deed.
  • It is a machine-readable document that proves, mathematically, that the specific audio data has been removed.
  • It lists exactly which wires were cut, how much noise was added, and provides a "mathematical guarantee" that the AI can no longer distinguish your specific audio from random noise. You can verify this certificate yourself!
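
To show what a machine-readable receipt could look like, here is a minimal sketch. The field names and the hash-based check are assumptions for illustration, not the paper's MDC format, and the sketch only checks that the certificate and the model parameters have not been altered; it does not reproduce the statistical guarantee.

```python
import hashlib
import json

def _digest(value):
    """SHA-256 over a canonical JSON encoding of the value."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode("utf-8")).hexdigest()

def issue_certificate(modality, selected, sigma, params_after):
    """Build a hypothetical deletion certificate as a plain dictionary."""
    body = {
        "modality": modality,                    # which modality was deleted
        "modified_indices": selected,            # which "wires" were changed
        "noise_sigma": sigma,                    # how much noise was added
        "params_sha256": _digest(params_after),  # fingerprint of the new model
    }
    body["certificate_sha256"] = _digest(body)   # seal computed over the fields above
    return body

def verify_certificate(cert, params_after):
    """Check that the certificate is intact and matches the given parameters."""
    body = {k: v for k, v in cert.items() if k != "certificate_sha256"}
    return (cert["params_sha256"] == _digest(params_after)
            and cert["certificate_sha256"] == _digest(body))

params_after = [0.5, 0.009, 0.3, -0.01]  # made-up post-deletion weights
cert = issue_certificate("audio", [1, 3], 0.01, params_after)
is_valid = verify_certificate(cert, params_after)
```

Anyone holding the certificate and the published parameters can rerun `verify_certificate`; changing either the parameters or any certificate field makes the check fail.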

Why is this a big deal?

  • Privacy: You can now demand that an AI forgets your voice or face without losing its ability to understand your text.
  • Efficiency: Instead of retraining the whole AI from scratch, which is slow and expensive, this "surgery" only touches the small set of parameters tied to the deleted modality.
  • Trust: The certificate gives you proof, not just a promise.

In a nutshell: MBD is like having a librarian who can take a specific book out of a library, shred the pages you don't want, and give you a notarized receipt proving the pages are gone, all while keeping the rest of the library perfectly organized and open for business.