Nearest-Neighbor Density Estimation for Dependency Suppression

This paper proposes an encoder-based approach that combines a specialized variational autoencoder with non-parametric nearest-neighbor density estimation to optimize explicitly for independence from sensitive variables, removing unwanted dependencies while preserving the data's essential utility.

Kathleen Anderson, Thomas Martinetz

Published 2026-03-05

The Big Idea: The "Privacy Blender"

Imagine you have a giant jar of smoothies. Each smoothie is made of fruit (the useful information you want to keep) and a specific type of leaf (a sensitive piece of information you want to remove, like a person's gender or a medical device in an X-ray).

Currently, if you try to drink the smoothie, you can't help but taste the leaf. If you try to pick the leaf out with your fingers, you might accidentally throw away some of the fruit, too.

This paper proposes a new "Smart Blender." It doesn't just try to pick the leaf out; it completely re-mixes the smoothie so that the leaf is still there, but it's so thoroughly blended that you can't taste it at all, while the fruit flavor remains perfectly intact.

The Problem: Hidden Biases

In the world of data (like photos or medical records), there are often "hidden biases."

  • Example: In a dataset of photos, maybe every photo of a "smiling" person happens to have a "square" background, and every "frowning" person has a "circle" background.
  • The Risk: If you train an AI to recognize smiles, it might cheat. Instead of learning what a smile looks like, it just learns to look for square backgrounds. This is bad because if you show it a photo with a circle background, it gets confused.

The goal of this paper is to teach the AI to ignore the "square vs. circle" background (the sensitive variable) while still remembering what a smile looks like.
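The shortcut problem is easy to reproduce. Here is a toy sketch (numpy only; the features, labels, and the linear "classifier" are all invented for illustration) where the background perfectly predicts the label during training, and that correlation is broken at test time:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
smile = rng.integers(0, 2, size=n)  # true label: smiling or not

# Training set: the background cue equals the label exactly (the bias),
# while the actual smile cue is noisy.
bg_train = smile.astype(float)
x_train = np.stack([smile + 0.5 * rng.normal(size=n),  # noisy smile cue
                    bg_train], axis=1)                  # clean background cue

# A least-squares linear classifier as a stand-in for "training an AI".
X = np.hstack([x_train, np.ones((n, 1))])
w, *_ = np.linalg.lstsq(X, smile.astype(float), rcond=None)

def accuracy(x, y):
    pred = (np.hstack([x, np.ones((len(x), 1))]) @ w) > 0.5
    return (pred == y).mean()

# Test set: backgrounds are now random, so the shortcut no longer works.
smile_test = rng.integers(0, 2, size=n)
x_test = np.stack([smile_test + 0.5 * rng.normal(size=n),
                   rng.integers(0, 2, size=n).astype(float)], axis=1)

print(accuracy(x_train, smile), accuracy(x_test, smile_test))
```

Because the background cue is noiseless during training, the fit leans entirely on it: training accuracy is perfect, but once backgrounds are randomized the classifier collapses toward chance.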

The Old Ways vs. The New Way

1. The "Adversarial" Approach (The Cat and Mouse Game)
Old methods try to train two AIs against each other. One AI tries to hide the secret (the leaf), and the other tries to find it.

  • The Flaw: It's like a game of hide-and-seek. The "hider" only learns to hide from that specific seeker. If you bring in a new, smarter seeker, the hider gets caught. It's unreliable.

2. The New Approach: "Nearest-Neighbor Density Estimation"
The authors (Anderson and Martinetz) took a different path. Instead of playing a game, they decided to measure the crowd.

Imagine a crowded room where people are standing based on their height and weight; each person also has a favorite color.

  • The Goal: We want to shuffle the people so that "tall people" and "short people" are mixed up randomly, but we don't want to mess up their "favorite color" (the useful data).
  • The Trick: They use a rule called "Nearest-Neighbor Density."
    • If you stand in a spot where there are many people very close to you, that spot is "crowded" (high density).
    • If you stand in a spot where the nearest person is far away, that spot is "empty" (low density).
    • The new method calculates: "If I move this person, does the crowd density around them change based on their secret (height)?"
    • If the answer is "Yes," the system nudges the person until the crowd looks the same regardless of their height.
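The crowd-measuring idea above can be sketched in a few lines. This is not the authors' actual training objective, just a numpy illustration on invented data: estimate the density at each point from the distance to its k-th nearest neighbor, then compare the density computed within a secret group against the density over everyone. When the two disagree, the representation leaks the secret.

```python
import numpy as np

def knn_density(x, k=10):
    """kNN density estimate at each point of x: density ~ k / (n * r_k^d),
    where r_k is the distance to the k-th nearest neighbor."""
    n, d = x.shape
    dists = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    r_k = np.sort(dists, axis=1)[:, k]  # column 0 is the self-distance (0)
    return k / (n * r_k ** d + 1e-12)

def leakage(z, s, k=10):
    """Mean |log| ratio of group-conditional vs. overall density.
    Near zero when the crowd looks the same regardless of the secret."""
    overall = knn_density(z, k)
    gaps = []
    for g in np.unique(s):
        idx = np.where(s == g)[0]
        gaps.append(np.abs(np.log(knn_density(z[idx], k) / overall[idx])))
    return float(np.mean(np.concatenate(gaps)))

rng = np.random.default_rng(0)
secret = rng.integers(0, 2, size=400)
# Leaky representation: the secret shifts the whole point cloud.
z_leaky = rng.normal(size=(400, 2)) + np.stack([4.0 * secret, np.zeros(400)], axis=1)
# Clean representation: the same cloud regardless of the secret.
z_clean = rng.normal(size=(400, 2))

print(leakage(z_leaky, secret), leakage(z_clean, secret))
```

The leaky representation scores a much larger gap than the clean one. Roughly speaking, the paper turns a differentiable version of this gap into a training penalty, nudging points until the conditional and overall crowds match.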

How They Did It (The Two-Step Recipe)

To make this math work on complex data like images, they used a two-step process:

Step 1: The "Organizer" (The VAE)
First, they use a tool called a Variational Autoencoder (VAE). Think of this as a very organized librarian.

  • The librarian takes messy books (images) and puts them on a shelf.
  • They create a special rule: "Put all the 'Secret Leaf' books in one specific row (Row 0)."
  • Now, the sensitive information is neatly isolated in one corner of the library.

Step 2: The "Shuffler" (The New Encoder)
Now comes the magic. They take that specific row (Row 0) and run it through a new machine.

  • This machine looks at the "crowd" (using the nearest-neighbor rule mentioned above).
  • It asks: "Are the people in this row clustered together because of their secret?"
  • If they are, the machine shuffles them around until the crowd looks random.
  • Because the librarian (Step 1) did such a good job organizing the rest of the library, the "Favorite Color" (useful data) stays perfectly safe while the "Secret Leaf" gets scrambled.
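The two steps can be caricatured in numpy. This is a toy stand-in, not the paper's method: step 1 is faked by construction, and step 2 uses a simple per-group quantile remap instead of a trained encoder.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
secret = rng.integers(0, 2, size=n)
useful = rng.normal(size=n)  # the "favorite color" we must not damage

# Step 1, the organizer: pretend a VAE already isolated the secret in
# latent coordinate 0 (here the secret shifts that coordinate by 2).
z = np.stack([rng.normal(size=n) + 2.0 * secret, useful], axis=1)

# Step 2, the shuffler: remap coordinate 0 within each group onto the
# overall quantiles, so the crowd looks the same for both groups.
ref = np.sort(z[:, 0])
z_scrubbed = z.copy()
for g in (0, 1):
    idx = np.where(secret == g)[0]
    ranks = np.argsort(np.argsort(z[idx, 0]))  # rank inside the group
    z_scrubbed[idx, 0] = np.quantile(ref, (ranks + 0.5) / len(idx))

gap_before = abs(z[secret == 1, 0].mean() - z[secret == 0, 0].mean())
gap_after = abs(z_scrubbed[secret == 1, 0].mean() - z_scrubbed[secret == 0, 0].mean())
print(gap_before, gap_after)  # the group gap collapses after scrubbing
```

The useful coordinate is never touched, so utility is preserved by construction. In the real system, step 2 is a trained encoder whose loss is the nearest-neighbor crowd comparison described earlier, so it can scramble the sensitive coordinate even when the dependency is more tangled than a simple shift.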

Why This Matters (The Results)

The authors tested this on three things:

  1. MNIST: Handwritten numbers with different background shapes.
  2. FFHQ: Faces of people (removing gender while keeping expressions).
  3. CheXpert: Medical X-rays (removing the presence of pacemakers while keeping the ability to diagnose lung issues).

The Results:

  • Better than the competition: Their "Smart Blender" removed the sensitive info better than previous unsupervised methods (methods that don't need a teacher to tell them what to remove).
  • Rivaled the experts: It performed almost as well as "supervised" methods (which do have a teacher), but without needing to know the answers in advance.
  • Robustness: Even when the data was messy or carried "noisy" labels (wrong tags), the method still improved learning, because scrubbing the sensitive clues stopped the AI from cheating by latching onto them.

The Takeaway

This paper introduces a clever way to "scrub" data of its secrets without throwing away the good stuff. By measuring how "crowded" the data points are and shuffling them until the crowd looks the same for everyone, they create a fairer, more robust dataset.

It's like taking a photo of a person, blurring out their gender so the AI can't tell if it's a man or a woman, but keeping the photo so sharp that the AI can still tell if they are smiling, frowning, or looking sick. This helps build AI that makes fair decisions without being biased by hidden clues.