STAMP: Selective Task-Aware Mechanism for Text Privacy

Imagine you are sending a very important letter to a friend, but you have to hand it to a suspicious courier (the AI model) to deliver it. You want your friend to understand the message perfectly, but you don't want the courier to see your secret address, your bank account number, or your mother's maiden name.

Traditionally, people tried to solve this by scrubbing the whole letter. They would take a giant eraser and blur out every single word equally.

The Problem: If you blur out the word "Einstein" in a question about physics, your friend can't answer the question. If you blur out "apple" in a recipe, the recipe is ruined. You lose the utility (usefulness) of the letter just to protect the secrets.

This paper introduces STAMP, a smarter way to handle this. Think of STAMP as a High-Tech, Selective Redaction Pen that knows exactly what to hide and what to keep clear.

Here is how it works, broken down into three simple concepts:

1. The "Traffic Light" System (Selective Budgeting)

Instead of treating every word the same, STAMP looks at every word in your sentence and asks two questions:

Is this word a secret? (e.g., "John Smith," "Credit Card #1234")
Is this word important for the task? (e.g., "Einstein" for a physics question, "Delicious" for a food review).

It then sorts words into four groups, like traffic lights:

🔴 Red (Secret + Unimportant): These are sensitive words that don't help the task (like a name in a weather report). STAMP gives these the maximum protection. It scrambles them heavily so the courier can't guess them at all.
🟢 Green (Not Secret + Important): These are the words your friend needs to understand the message (like "rain" in a weather report). STAMP gives these almost no protection. They stay clear and crisp.
🟡 Yellow (Secret + Important): These are tricky (like a name that is also the answer to a riddle). STAMP has to balance them, giving them a "medium" amount of scrambling.
⚪ White (Not Secret + Unimportant): Words like "the," "and," or "very." These get a little bit of scrambling, but not much.

The Analogy: Imagine you are packing a suitcase for a trip. You don't wrap your entire suitcase in bubble wrap. You wrap your fragile, expensive vase (the secret) in thick bubble wrap, but you leave your t-shirt (the important info) loose so it's easy to grab. STAMP does exactly this with words.

2. The "Spinning Top" Trick (The Polar Mechanism)

Once STAMP decides how much to scramble a word, it has to actually change the word without making it look like gibberish.

Most old methods tried to scramble words by adding "static noise" (like turning up the volume on a radio until it's just static). This often breaks the meaning.

STAMP uses a clever geometric trick called the Polar Mechanism.

The Analogy: Imagine every word is a spinning top standing on a table. The top has a height (how strong the word is) and a direction it is pointing (what the word means).
The Magic: STAMP only spins the top to change its direction. It leaves the height exactly the same.
Why this helps: In the world of AI, the "direction" of a word is what gives it meaning. By only spinning the direction slightly, the word stays in the same "neighborhood" of meaning. "Cat" might spin slightly to become "Kitten" or "Feline," but it won't accidentally turn into "Banana." This keeps the sentence readable while still hiding the exact original word.

3. The Result: A Better Trade-Off

The paper tested this on three different tasks:

Answering Questions (SQuAD): Can the AI answer "Who developed relativity?" even if the name "Einstein" is hidden? Yes, because STAMP kept the context words clear.
Sentiment Analysis (Yelp): Can the AI tell if a restaurant review is positive or negative? Yes, because the words describing the food weren't scrambled.
News Classification (AG News): Can the AI tell if an article is about Sports or Politics? Yes.

The Bottom Line:
Old methods were like putting a blindfold on the whole team. STAMP is like putting a blindfold only on the players who are holding the secrets, while letting the players who need to see the ball keep their eyes open.

This allows you to send your data to the cloud (or an AI) with stronger privacy for your secrets and much better performance on the task you actually want to do. You give up far less utility to get your privacy.

Here is a detailed technical summary of the paper "STAMP: Selective Task-Aware Mechanism for Text Privacy."

1. Problem Statement

Modern Large Language Models (LLMs) often process user inputs containing sensitive information (e.g., PII, names, dates). While Local Differential Privacy (LDP) is a standard framework for protecting user data by randomizing inputs locally before transmission, existing text privatization methods suffer from a poor privacy-utility trade-off:

Uniform Noise: Classical approaches (e.g., randomized response or adding isotropic Gaussian/Laplace noise to embeddings) apply the same level of noise to all tokens. This degrades utility by distorting semantically crucial tokens while wasting privacy budgets on irrelevant tokens (e.g., stop words).
Task Agnosticism: Prior selective methods often rely on static linguistic heuristics (like part-of-speech tags) rather than the specific downstream task context. A token might be critical for one query (e.g., "Einstein" for a physics question) but irrelevant for another, yet static methods cannot adapt to this dynamic importance.
Geometric Mismatch: Adding isotropic noise to embedding vectors often disrupts the semantic structure of the embedding space. Furthermore, decoding perturbed embeddings using mismatched rules can lead to incoherent text.
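The uniform-noise problem can be illustrated with a toy sketch. This is not the paper's baseline implementation: the embedding dimension, the random embedding, and the Laplace scales are all assumed for illustration and are not calibrated to any particular privacy budget. It only shows how the same per-coordinate Laplace noise changes both the norm and the direction of an embedding as the scale grows.

```python
import numpy as np

def laplace_drift(scale, rng, d=300):
    """Add i.i.d. Laplace noise to a random unit vector.

    Returns (norm of the noisy vector, cosine similarity to the original).
    `d` and `scale` are hypothetical values chosen for illustration only.
    """
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)  # original embedding, normalized to unit length
    noisy = x + rng.laplace(0.0, scale, size=d)  # same noise scale on every coordinate
    noisy_norm = np.linalg.norm(noisy)
    cosine = float(noisy @ x / noisy_norm)
    return float(noisy_norm), cosine

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for scale in (0.01, 0.1, 1.0):
        norm, cos = laplace_drift(scale, rng)
        print(f"scale={scale:<5} norm={norm:.3f} cosine_to_original={cos:.3f}")
```

Because the noise is added independently to every coordinate, its total length grows with the dimension, so at larger scales the noisy vector points in an almost unrelated direction.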

2. Methodology: The STAMP Framework

STAMP (Selective Task-Aware Mechanism for Text Privacy) addresses these issues through a two-pronged approach: Selective Budget Allocation and Geometry-Aligned Perturbation.

A. Selective, Task-Aware Budget Allocation

STAMP partitions tokens into four distinct groups based on two binary dimensions:

Privacy Sensitivity: Determined by Named Entity Recognition (NER) or PII detection (e.g., names, locations, IDs).
Task Importance: Determined dynamically by the cosine similarity between a token's embedding and a task-specific or query-specific representation.

This creates four groups:

Group 1: High Sensitivity + High Importance (Moderate budget).
Group 2: High Sensitivity + Low Importance (Strongest protection/Smallest budget).
Group 3: Low Sensitivity + High Importance (Weakest protection/Largest budget to preserve utility).
Group 4: Low Sensitivity + Low Importance (Moderate budget).

The framework assigns specific privacy budgets ( $\epsilon$ ) to each group, ensuring that noise is concentrated on sensitive but task-irrelevant tokens, while preserving the integrity of task-critical information.
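The grouping and budget assignment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the importance threshold, and the numeric budgets in `GROUP_EPSILON` are hypothetical (the summary gives only the ordering of the budgets), and the sensitivity flags are assumed to come from an external NER or PII detector.

```python
import numpy as np

# Hypothetical budgets keyed by (is_sensitive, is_important).
# Only the ordering follows the text; the numbers are made up.
GROUP_EPSILON = {
    (True, True): 4.0,     # sensitive, important: moderate budget
    (True, False): 1.0,    # sensitive, unimportant: smallest budget
    (False, True): 16.0,   # not sensitive, important: largest budget
    (False, False): 4.0,   # not sensitive, unimportant: moderate (assumed)
}

def cosine_to_task(token_embs, task_emb):
    """Cosine similarity between each token embedding and the task vector."""
    tokens = token_embs / np.linalg.norm(token_embs, axis=1, keepdims=True)
    task = task_emb / np.linalg.norm(task_emb)
    return tokens @ task

def allocate_budgets(token_embs, task_emb, is_sensitive, threshold=0.5):
    """Return one epsilon per token from its (sensitivity, importance) group.

    `threshold` is an assumed cutoff that turns the similarity score
    into the binary importance flag.
    """
    is_important = cosine_to_task(token_embs, task_emb) >= threshold
    return np.array([
        GROUP_EPSILON[(bool(s), bool(i))]
        for s, i in zip(is_sensitive, is_important)
    ])

if __name__ == "__main__":
    # Toy 2-D embeddings; the task vector points along the first axis.
    tokens = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
    task = np.array([1.0, 0.0])
    sensitive = [True, True, False, False]  # e.g. flags from an NER tagger
    print(allocate_budgets(tokens, task, sensitive))
```

In this toy input, the first and third tokens point along the task vector, so they land in the two "important" groups. The other two fall below the threshold and are treated as unimportant.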

B. The Polar Mechanism (Geometry-Aligned Perturbation)

Instead of adding noise in the Euclidean space ( $\mathbb{R}^d$ ), STAMP introduces the Polar Mechanism, which operates on the unit sphere:

Decomposition: Token embeddings are decomposed into magnitude (radius) and direction (unit vector).
Perturbation:
- Direction: Perturbed using von Mises-Fisher (vMF) noise on the unit sphere. This preserves the semantic neighborhood structure better than isotropic noise.
- Magnitude: The mechanism uses a Normalized Polar approach in which the magnitude is discarded entirely (set to 1). This is justified by the "radial invariance" property: downstream decoding relies on cosine similarity, which depends only on direction, so the magnitude carries no semantic information for the task and merely correlates with token frequency.
Decoding: The privatized embedding is decoded via Cosine Nearest-Neighbor Search. Because the perturbation and decoding both operate on angular geometry, the semantic relationships are preserved more effectively than with isotropic noise.
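The three steps above can be sketched in NumPy. This is an illustrative sketch, not the paper's implementation: the vMF sampler uses Wood's (1994) rejection method, the function names are hypothetical, and the concentration `kappa` is passed in directly because the summary does not state how a privacy budget $\epsilon$ is calibrated to it.

```python
import numpy as np

def sample_vmf(mu, kappa, rng):
    """Draw one vMF(mu, kappa) sample on the unit sphere (Wood's method).

    `mu` must be a unit vector with at least 2 dimensions and kappa > 0.
    """
    d = mu.shape[0]
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * np.log(1.0 - x0**2)
    while True:  # rejection-sample w, the cosine of the angle to mu
        z = rng.beta((d - 1) / 2.0, (d - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        if kappa * w + (d - 1) * np.log(1.0 - x0 * w) - c >= np.log(rng.uniform()):
            break
    # Uniform random direction in the tangent space orthogonal to mu.
    v = rng.standard_normal(d)
    v -= v.dot(mu) * mu
    v /= np.linalg.norm(v)
    return w * mu + np.sqrt(max(0.0, 1.0 - w**2)) * v

def polar_privatize(embedding, kappa, vocab_embs, rng):
    """Normalized polar sketch: drop the magnitude, perturb the direction,
    then decode with cosine nearest-neighbor search over the vocabulary."""
    direction = embedding / np.linalg.norm(embedding)  # magnitude discarded
    noisy = sample_vmf(direction, kappa, rng)          # stays on the unit sphere
    vocab_dirs = vocab_embs / np.linalg.norm(vocab_embs, axis=1, keepdims=True)
    return int(np.argmax(vocab_dirs @ noisy)), noisy   # index of decoded token

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocab = rng.standard_normal((50, 16))  # toy vocabulary of random embeddings
    for kappa in (1.0, 50.0, 1000.0):      # larger kappa = less angular noise
        idx, _ = polar_privatize(vocab[7], kappa, vocab, rng)
        print(f"kappa={kappa:<7} decoded token index: {idx}")
```

A large `kappa` keeps the sample close to the original direction, so the decoder tends to return the original token. A small `kappa` spreads samples over the sphere, so the decoded token is more often a different one.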

C. Privacy Guarantees

STAMP satisfies Task-Aware Metric LDP. It provides formal guarantees where the privacy budget is allocated per group, and the indistinguishability is scaled by the metric distance between tokens within those groups.
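
For reference, a metric LDP guarantee is commonly written in the form below. The notation is assumed here for illustration ($M$ for the mechanism, $d$ for the token metric, $\epsilon_g$ for the budget of group $g$), and the paper's exact per-group statement may differ:

$$
\Pr[M(x) \in S] \;\le\; e^{\epsilon_g \, d(x, x')} \, \Pr[M(x') \in S]
\quad \text{for all tokens } x, x' \text{ in group } g \text{ and all output sets } S.
$$

Read this way, a smaller $\epsilon_g$ makes any two tokens in the group harder to tell apart from the output, and tokens that are close under $d$ are always harder to distinguish than distant ones.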

3. Key Contributions

Selective Task-Aware Allocation: A novel framework that dynamically allocates privacy budgets based on both token sensitivity and task relevance, moving beyond static linguistic heuristics.
The Polar Mechanism: A geometry-aware perturbation method that privatizes only the direction of embeddings on the unit sphere and adds no noise to the magnitude (in the normalized variant, the magnitude is discarded), aligning the perturbation geometry with the cosine-based decoding geometry.
Comprehensive Evaluation: Extensive experiments demonstrating that STAMP consistently outperforms uniform budget allocation and isotropic noise mechanisms across diverse NLP tasks.

4. Experimental Results

The authors evaluated STAMP on three datasets: SQuAD (Question Answering), Yelp (Sentiment Analysis), and AG News (Topic Classification).

Polar vs. Laplace: Under matched privacy budgets, the Polar mechanism (vMF noise) significantly outperformed the isotropic Laplace mechanism. While Laplace performance collapsed toward random chance at lower budgets, Polar maintained high utility and approached the non-private baseline as budgets increased.
STAMP vs. Uniform: STAMP consistently achieved superior privacy-utility trade-offs compared to a uniform budget allocation scheme.
- In SQuAD, STAMP preserved the ability to answer questions correctly by keeping task-relevant tokens largely intact while masking sensitive entities.
- In Yelp and AG News, STAMP maintained higher classification accuracy by minimizing noise on sentiment-bearing or topic-defining words.
Computational Overhead: The overhead of the task-aware grouping and vMF sampling was small (approx. 195 ms/token vs. 192 ms/token for the baseline, about 1.6% more), making it practical for real-world deployment.

5. Significance

STAMP treats privacy not as a uniform property of text but as a contextual choice that depends on each token's sensitivity and its relevance to the task.

Efficiency: It maximizes the utility of a fixed privacy budget by intelligently distributing noise where it causes the least harm to the task.
Semantic Preservation: By aligning the perturbation geometry (angular) with the decoding geometry (cosine similarity), it preserves the semantic structure of language better than traditional Euclidean noise addition.
Practicality: The framework is modular and computationally efficient, offering a viable way to deploy LLMs on sensitive data (e.g., healthcare, finance) with less loss of model performance than uniform noise.

In conclusion, STAMP demonstrates that selective, task-aware privatization combined with geometry-aligned noise is a superior strategy for balancing privacy and utility in modern NLP systems.
