Explainable embeddings with Distance Explainer

This paper introduces Distance Explainer, a novel post-hoc XAI method that adapts saliency-based techniques to explain the similarity or dissimilarity between data points in embedded vector spaces. It assigns attribution values through selective masking and distance-ranked filtering, producing robust, local explanations.

Christiaan Meijer, E. G. Patrick Bos

Published 2026-03-26

Imagine you have a giant, invisible library where every book, photo, and song is assigned a specific seat based on how similar it is to everything else. This is what AI calls an "embedded space."

In this library:

  • A picture of a bee sits right next to a picture of a fly.
  • A picture of a bee sits far away from a picture of a car.
  • A sentence saying "a bee on a flower" sits right next to the picture of the bee.

The problem? We humans can't see why the AI decided to put the bee next to the fly. The AI just knows they are "close" in its mathematical map, but it can't tell us which parts of the bee (the wings? the stripes?) made it look like a fly.

This paper introduces a new tool called Distance Explainer to solve this mystery. Here is how it works, using simple analogies.

The Problem: The "Black Box" Distance

Think of the AI as a judge in a contest. It looks at two contestants (say, a photo of a bee and a photo of a fly) and says, "These two are 90% similar."
But if you ask, "Why?" the AI usually just shrugs. It's like a judge who gives a score but won't explain which specific move earned the points.

The Solution: The "Masking Game"

The authors took an existing trick called RISE (which was used to explain single images) and adapted it for comparing two things. They call their new method Distance Explainer.

Here is the step-by-step process, imagined as a game of "Hide and Seek":

  1. The Setup: You have your "Target" (the bee photo) and a "Reference" (the fly photo).
  2. The Game: The AI plays a game where it randomly covers up (masks) parts of the Target photo with a black blanket.
    • Example: It covers the bee's wings.
    • Example: It covers the bee's stripes.
    • Example: It covers the bee's eyes.
  3. The Check: After covering a part, the AI asks: "Does this covered-up bee still look like the fly?"
    • If you cover the wings, the bee suddenly looks very different from the fly. The distance between them grows huge.
    • If you cover the stripes, the bee still looks a bit like the fly. The distance doesn't change much.
  4. The Scorecard: The AI repeats this thousands of times, covering different random spots.
    • It keeps a tally: "Every time we covered the wings, the similarity dropped drastically."
    • "Every time we covered the stripes, the similarity stayed the same."
  5. The Result: The AI draws a heat map (a picture with red and blue colors).
    • Red areas are the parts that made the two images dissimilar (if you hide them, they look more alike).
    • Blue areas are the parts that made them similar (if you hide them, they look less alike).
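The masking game above can be sketched in a few lines of NumPy. This is a minimal, illustrative version, not the authors' implementation: `embed` is a stand-in for a real embedding model, the mask grid is upsampled with nearest-neighbour blocks (the original RISE uses smooth bilinear upsampling), and all names are assumptions made for this sketch.

```python
import numpy as np

def embed(image):
    # Stand-in for a real embedding model (e.g. an image encoder);
    # flattening keeps the sketch runnable without any model weights.
    return image.reshape(-1).astype(float)

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def distance_saliency(target, reference, n_masks=500, p_keep=0.5, cells=4, seed=0):
    """RISE-style saliency for a distance: randomly mask `target`,
    re-embed it, and attribute the change in distance to the hidden pixels."""
    rng = np.random.default_rng(seed)
    ref_emb = embed(reference)
    base = cosine_distance(embed(target), ref_emb)
    h, w = target.shape
    saliency = np.zeros((h, w))
    counts = np.zeros((h, w))
    for _ in range(n_masks):
        # Coarse random grid, upsampled to image size with block repeats.
        coarse = (rng.random((cells, cells)) < p_keep).astype(float)
        mask = np.kron(coarse, np.ones((h // cells, w // cells)))
        d = cosine_distance(embed(target * mask), ref_emb)
        hidden = 1.0 - mask
        saliency += hidden * (d - base)  # positive: hiding this pushed them apart
        counts += hidden
    # Average the tallied distance changes per pixel: the "scorecard".
    return saliency / np.maximum(counts, 1.0)
```

In this sketch, a large positive score means the distance grew whenever that region was hidden, i.e. the region was contributing to the similarity between the two inputs.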

Why This is Special

Previous tools could only explain why an AI thought, "This is a bee." This new tool explains relationships. It answers: "Why does the AI think this bee is closer to a fly than to a car?"

It works like a detective who doesn't just look at the crime scene, but compares two suspects side-by-side to see exactly what features make them look alike or different.

The "Mirror" Trick

The authors added a clever twist called the "Mirror Mode."
Imagine you are trying to hear a whisper in a noisy room. If you listen to the noise and subtract it, you hear the whisper better.

  • The AI looks at the parts that make the images very different (the "noise").
  • It also looks at the parts that make them very similar.
  • By comparing these two lists, it cancels out the "noise" and highlights the true, important features. This makes the explanation much clearer and less fuzzy.
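A toy illustration of this noise-cancelling intuition (not the paper's exact formulation): if the random masking process adds the same noise to the "makes them similar" map and the "makes them different" map, subtracting one from the other leaves only the true signal. The arrays below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Masking noise that shows up in both views of the explanation.
noise = rng.normal(0.0, 1.0, (8, 8))

# A truly important region (the "whisper" we want to hear).
signal = np.zeros((8, 8))
signal[2:4, 2:4] = 3.0

# Map from the "distance grows when hidden" side: signal plus noise.
sal_similar = signal + noise
# Mirrored map from the "distance shrinks when hidden" side: only noise here.
sal_dissimilar = noise

# Subtracting the mirrored map cancels the shared noise, leaving the signal.
denoised = sal_similar - sal_dissimilar
```

In this toy setup the subtraction recovers the important region exactly; in practice the cancellation is only partial, but the explanation becomes noticeably sharper.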

What They Found

They tested this on:

  • Images vs. Images: Showing that the AI knows a bee and a fly are similar because of their wings, but different because of their stripes.
  • Images vs. Text: Showing that if you show a picture of a bee and type "a fly," the AI knows exactly which parts of the picture contradict the text.
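The image-vs-text case works because the masking game never looks inside the reference; it only needs the reference's embedding vector. A hedged sketch of that idea, with toy seeded vectors standing in for real CLIP-style encoders that map images and text into one shared space:

```python
import numpy as np

def toy_encoder(seed_key, dim=64):
    # Stand-in for modality-specific encoders (image or text) that map
    # into one shared embedding space, as in CLIP-style models; real
    # encoders would be neural networks, not seeded random vectors.
    rng = np.random.default_rng(seed_key)
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

# Target: an image embedding (this is what gets masked and re-embedded).
image_emb = toy_encoder(0)
# Reference: a text embedding, e.g. for the caption "a fly".
text_emb = toy_encoder(1)

# Cosine distance between unit vectors; the masking loop only ever
# compares the masked target's embedding against this fixed reference.
distance = 1.0 - float(np.dot(image_emb, text_emb))
```

Because only the reference's vector enters the distance, the same explanation loop applies unchanged whether the reference is a second image or a sentence.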

The Takeaway

This tool is like handing us a pair of glasses for looking inside the AI. Suddenly, we can see why the AI's internal map is arranged the way it is. It doesn't just tell us the distance between two points; it tells us which features are pulling them together and which are pushing them apart.

This is a huge step forward for Explainable AI (XAI) because it helps us trust complex models (like those used in medical diagnosis or self-driving cars) by showing us the specific reasons behind their decisions, rather than just the final result.
