Go Beyond Your Means: Unlearning with Per-Sample Gradient Orthogonalization

This paper introduces OrthoGrad, a novel machine unlearning method that removes the influence of specific data by projecting unlearn gradients onto the subspace orthogonal to the retain gradients. This mitigates interference between forgetting and remembering, and outperforms existing approaches even when only a small portion of the training data is available.

Aviv Shamsian, Eitan Shaar, Aviv Navon, Gal Chechik, Ethan Fetaya

Published Tue, 10 Ma

🧠 The Big Problem: The "Forgetful" AI

Imagine you have a brilliant student (an AI model) who has read the entire internet. They are incredibly smart, but they've also memorized some things they shouldn't have:

  • Private photos of people who asked to be forgotten.
  • Copyrighted code snippets they stole from GitHub.
  • A specific person's voice that they shouldn't be able to recognize.

You want the student to forget these specific things. But here's the catch: You can't just erase their brain. If you try to scrub out the bad memories, you might accidentally wipe out their ability to do math, write poetry, or recognize other voices.

This is the challenge of Machine Unlearning: How do you make an AI forget specific data without ruining its general smarts?

🚫 The Old Way: The "Tug-of-War"

Most previous methods tried to fix this by playing a game of Tug-of-War.

  • Team Forget: They pull the model in one direction to make it forget the bad data (Gradient Ascent).
  • Team Remember: They pull the model in the opposite direction to keep it good at everything else (Gradient Descent).

The Flaw: This only works if you have a huge team of "Rememberers" (a massive dataset of the original training data) to balance out the "Forgetting."

  • The Reality: Often, the company that trained the AI doesn't have the original data anymore (maybe it was deleted, or it's too big to store). They only have a tiny scrap of data (a small "retain set") to help the model remember.
  • The Result: With a tiny team of Rememberers, the Tug-of-War fails. The model either forgets everything (including the good stuff) or remembers the bad stuff.
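The tug-of-war can be sketched as a single combined update direction: descend on the retain loss while ascending on the forget loss. This is a toy illustration, not the exact formulation from any particular paper; the gradient values and the balance weight `lam` are made up.

```python
import numpy as np

# Hypothetical gradients on a tiny 3-parameter model (made-up numbers).
g_forget = np.array([1.0, 0.0, 0.5])   # gradient of the loss on the forget set
g_retain = np.array([0.8, 0.2, 0.4])   # gradient of the loss on the retain set

lam = 1.0  # weight balancing "remember" against "forget"

# Tug-of-war update direction: gradient descent on the retain loss
# plus gradient ascent (negated descent) on the forget loss.
update = lam * g_retain - g_forget
```

With a large retain set, `g_retain` is a reliable estimate and the balance can work; with a tiny retain set, the two terms simply fight and one side loses.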

💡 The New Solution: OrthoGrad (The "Sideways Step")

The authors propose a new method called OrthoGrad. Instead of fighting the "Forget" force with a "Remember" force, they change the geometry of the problem.

The Analogy: The Dance Floor

Imagine the AI's knowledge is a giant dance floor.

  • The Bad Data (the thing to forget) is a group of dancers trying to pull the main dancer (the AI) toward the North.
  • The Good Data (the tiny scrap of retained data) is a small group trying to keep the dancer from moving too far East.

Old Method (Tug-of-War): The small group tries to pull East while the big group pulls North. The dancer gets stuck in the middle, or the small group gets dragged away.

OrthoGrad Method (The Sideways Step):
Instead of pulling East, the small group tells the dancer: "Don't worry about pulling us back. Just take a step North, but make sure you don't step East or West at all."

Mathematically, they project the "Forget" movement onto a path that is perfectly perpendicular (orthogonal) to the "Remember" movement.

  • They look at the tiny scrap of good data.
  • They calculate the exact direction that data cares about.
  • They force the "Forget" update to go in a direction that is 90 degrees to that.

Why this is magic: Because the update is perpendicular, it cannot (at least to a first-order approximation) accidentally mess up the good data. It's like walking down a hallway: you can walk forward (forgetting the bad thing) without bumping into the walls on your left and right (the good things).
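For a single retain gradient, the "sideways step" is just classical vector projection: subtract from the forget gradient its component along the retain gradient, and what remains is orthogonal to it. A minimal NumPy sketch with made-up gradient values:

```python
import numpy as np

# Hypothetical gradients (made-up numbers for illustration).
g_forget = np.array([1.0, 2.0, 0.0])   # direction that erases the bad data
g_retain = np.array([0.0, 1.0, 0.0])   # direction the good data cares about

# Remove the component of the forget gradient that lies along the
# retain gradient; the remainder is perpendicular to it.
coef = (g_forget @ g_retain) / (g_retain @ g_retain)
g_orth = g_forget - coef * g_retain

# The projected update no longer moves the model along g_retain at all.
assert abs(g_orth @ g_retain) < 1e-12
```

Because `g_orth` has zero dot product with `g_retain`, stepping along it leaves the retain loss unchanged to first order.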

🛠️ How It Works (The "Per-Sample" Secret Sauce)

The paper has a second clever trick.

  • Old methods looked at the "Average" of the good data. It's like asking a crowd, "What do you think?" and taking the middle answer. If the crowd is small, the average is shaky and unreliable.
  • OrthoGrad looks at every single person in that tiny crowd individually. It builds a "safety net" based on every single sample, not just the average.

The Analogy:
Imagine you are trying to walk through a forest without stepping on any flowers.

  • Average Method: You look at the forest from a helicopter, see a "general area" of flowers, and try to avoid that area. You might still step on a flower because the map was blurry.
  • OrthoGrad: You look at every single flower on the ground. You calculate a path that goes between every single one of them. Even if you only have 5 flowers to avoid, you can weave a perfect path through them without touching a petal.
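The per-sample idea can be sketched by projecting onto the orthogonal complement of the span of *all* individual retain gradients, not just their average. Here is an illustrative implementation using a QR decomposition; the dimensions and random gradients are made up, and real models would use per-sample gradients from the network:

```python
import numpy as np

rng = np.random.default_rng(0)
n_params, n_retain = 8, 3

# Per-sample retain gradients stacked as columns (toy random stand-ins).
G_retain = rng.normal(size=(n_params, n_retain))
g_forget = rng.normal(size=n_params)

# Orthonormal basis Q for the span of all per-sample retain gradients.
Q, _ = np.linalg.qr(G_retain)

# Project the forget gradient onto the orthogonal complement of that span:
# subtract its component inside the span.
g_orth = g_forget - Q @ (Q.T @ g_forget)

# g_orth is orthogonal to every individual retain-sample gradient,
# not merely to their average.
assert np.allclose(G_retain.T @ g_orth, 0.0, atol=1e-10)
```

Averaging first would only guarantee orthogonality to the mean gradient; the per-sample projection weaves between every "flower" individually.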

🎤 Real-World Tests

The authors tested this on two very different things:

  1. Speech Recognition (Whisper): They made the AI forget a specific person's voice. Even with very little data to "remember" the rest of the world, OrthoGrad made the AI stop recognizing that one person while still understanding everyone else perfectly.
  2. Image Classification (ImageNet): They made the AI forget a whole category of images (like "dogs") or random pictures. OrthoGrad did a better job than all other methods at forgetting the target while keeping the rest of the brain sharp.

🏆 The Bottom Line

OrthoGrad is a new way to teach an AI to forget.

  • The Problem: You can't always retrain an AI from scratch, and you often don't have the original data to help it remember.
  • The Solution: Instead of fighting to keep the old knowledge, OrthoGrad takes a "sideways step." It updates the model in a direction that is mathematically guaranteed, to a first-order approximation, not to touch the knowledge you want to keep.
  • The Benefit: It works even when you have very little data to work with, making it perfect for real-world situations where privacy laws or data loss make the original training sets unavailable.

In short: It's the art of forgetting the bad stuff by walking a path that simply doesn't exist for the good stuff.