Using the Path of Least Resistance to Explain Deep Networks

This paper introduces Geodesic Integrated Gradients (GIG), a novel attribution method that replaces the straight paths of standard Integrated Gradients with geodesics under a model-induced Riemannian metric to eliminate feature-wise cancellation and produce more faithful explanations for deep neural networks.

Sina Salek, Joseph Enguehard

Published 2026-02-27

The Big Problem: The "Straight Line" Trap

Imagine you are trying to explain to a friend how a complex machine (a Deep Learning Model) decided to identify a picture of a jet plane.

The most popular way to do this right now is called Integrated Gradients (IG). Think of IG like a hiker who insists on walking in a perfectly straight line from a "blank canvas" (a black image) to the "jet plane" image.

The Problem:
In the real world, the "landscape" of a neural network isn't flat. It has hills, valleys, and cliffs.

  • The Straight Line Flaw: If the hiker walks in a straight line, they might accidentally walk right over a steep cliff (a region where the model is very confused or changes its mind rapidly) or through a swamp (a region that looks like a jet but isn't).
  • The Result: The hiker blames the wrong features. In the paper's example, the straight-line method looked at a jet and said, "The wings don't matter because the straight line passed through a confusing area." It gave a misleading explanation.
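To make the "straight line" concrete: IG averages the model's gradient at evenly spaced points along the line from the baseline to the input, then scales by the input difference. Here is a minimal sketch on a hypothetical two-feature toy model F(x1, x2) = x1 · x2 (not the paper's model), where the gradient is known analytically:

```python
import numpy as np

def grad_F(x):
    # Analytic gradient of the toy model F(x1, x2) = x1 * x2
    return np.array([x[1], x[0]])

def integrated_gradients(x, baseline, grad_fn, steps=100):
    # Riemann-sum approximation of IG's path integral along the
    # straight line from baseline to x (the "straight-line hiker").
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

x = np.array([2.0, 3.0])
baseline = np.zeros(2)
attr = integrated_gradients(x, baseline, grad_F)
# Completeness: attributions sum to F(x) - F(baseline) = 6 - 0
print(attr, attr.sum())  # → [3. 3.] 6.0
```

The completeness check at the end is the "checkbook" property discussed later: the attributions always sum to the change in the model's output, whatever path is taken.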

The Solution: The "Path of Least Resistance"

The authors propose a new method called Geodesic Integrated Gradients (GIG).

Instead of forcing a straight line, GIG asks: "What is the easiest, smoothest path to get from the black image to the jet image, avoiding the cliffs and swamps?"

  • The Analogy: Imagine you are a river flowing from a mountain spring (the black image) to the ocean (the jet image). A river never walks in a straight line if there is a mountain in the way; it curves around the mountain to find the path of least resistance.
  • The Magic: GIG calculates a "map" of the model's sensitivity (where it is sensitive and where it is flat). It then finds the path that flows smoothly through the "flat" areas and only crosses the "steep" areas when absolutely necessary.
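The "path of least resistance" idea can be sketched numerically. Below, a hypothetical sensitivity field (a bump of high gradient magnitude at the origin, standing in for a "cliff" in the model) assigns a cost to every point, and a path's "energy" is cost × length summed along it. A curved detour is longer, yet cheaper than the straight line through the bump. This is an illustration under assumed geometry, not the paper's algorithm:

```python
import numpy as np

def cost(p):
    # Hypothetical sensitivity field: a steep bump at the origin
    # stands in for a region where the model's gradients are large.
    return 1.0 + 10.0 * np.exp(-8.0 * (p ** 2).sum())

def path_energy(points):
    # Sum of cost * segment length along a discretized path
    e = 0.0
    for a, b in zip(points[:-1], points[1:]):
        mid = (a + b) / 2
        e += cost(mid) * np.linalg.norm(b - a)
    return e

t = np.linspace(0, 1, 200)
start, end = np.array([-1.0, 0.0]), np.array([1.0, 0.0])
straight = start + t[:, None] * (end - start)
# A detour arcing over the bump: same endpoints, curved middle
detour = straight + np.stack([np.zeros_like(t), np.sin(np.pi * t)], axis=1)
print(path_energy(straight), path_energy(detour))
# The longer detour is "cheaper" because it avoids the steep region
```

A geodesic under this kind of metric is exactly the path that minimizes such an energy, which is why the river curves around the mountain.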

The New Rule: "No Cancellation"

The paper introduces a new rule for good explanations called No-Cancellation Completeness (NCC).

The Analogy:
Imagine you are balancing a checkbook.

  • Old Rule (Completeness): "The total sum of your transactions must equal your final balance."
    • The Loophole: You could have a huge deposit of $1,000 and a huge withdrawal of $1,000. The math adds up perfectly, but it hides the fact that a lot of money actually moved. In AI, this means positive and negative gradient contributions along the path can cancel: a feature might contribute +100 at one point on the path and -100 at another, so its final attribution is 0 even though the model was highly sensitive to it. The total is right, but the individual explanations are misleading.
  • New Rule (NCC): "Not only must the total balance be right, but you cannot hide a massive withdrawal behind a massive deposit."
    • GIG ensures that if a feature is important, it gets a high score, and if it's not, it gets a low score. It doesn't let them cancel each other out to hide the truth.
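Cancellation is easy to reproduce on a toy example. For a hypothetical 1-D model F(x) = x² − x (an illustration, not from the paper), the gradient along the straight path from 0 to 1 is negative for the first half and positive for the second, so the per-step contributions sum to zero even though the feature clearly mattered:

```python
import numpy as np

def grad_F(x):
    # Toy 1-D model F(x) = x**2 - x, gradient 2x - 1
    return 2 * x - 1

steps = 1000
alphas = (np.arange(steps) + 0.5) / steps
# Per-step contributions of the single feature along the straight path 0 -> 1
contribs = grad_F(alphas) * (1.0 / steps)

net = contribs.sum()               # signed sum: completeness only
movement = np.abs(contribs).sum()  # total movement that cancellation hides
print(net, movement)  # → net ~ 0.0, movement ~ 0.5
```

Completeness is satisfied (F(1) − F(0) = 0, and the attribution is 0), but the zero hides half a unit of back-and-forth movement. NCC rules out exactly this kind of hidden deposit-and-withdrawal along the path.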

How They Do It (The Two Tools)

Since finding the perfect "river path" is mathematically hard, the authors built two tools to approximate it:

  1. The "Neighborhood" Map (k-Nearest Neighbors):

    • Best for: Simple, low-dimensional data (like a 2D graph).
    • How it works: Imagine dropping thousands of pins on a map between the start and end points. You connect each pin to its closest neighbors. You then look for the path that requires the least "energy" to walk. It's like finding the shortest walking trail through a dense forest by hopping from tree to tree.
  2. The "Magnetic" Path (Stochastic Variational Inference):

    • Best for: Complex, high-dimensional data (like real photos).
    • How it works: Imagine a rubber band stretched between the start and end points. You place magnets around the rubber band that repel it if it gets too close to a "high gradient" (dangerous) zone. The rubber band naturally snaps into a curved shape that avoids the magnets. The computer simulates this snapping process to find the best path.
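The first tool, the "neighborhood map", can be sketched with nothing but numpy and a hand-rolled Dijkstra search. Everything here is an assumed stand-in: the `cost` field plays the role of the model's gradient magnitude, random 2-D points play the role of samples dropped between baseline and input, and edge weights are length × sensitivity at the edge midpoint:

```python
import heapq
import numpy as np

rng = np.random.default_rng(0)

def cost(p):
    # Stand-in for gradient magnitude: a steep bump at the origin
    return 1.0 + 10.0 * np.exp(-8.0 * (p ** 2).sum())

start, end = np.array([-1.0, 0.0]), np.array([1.0, 0.0])
pts = np.vstack([start, end, rng.uniform(-1.5, 1.5, size=(300, 2))])

# Connect each point to its k nearest neighbours; weight each edge by
# Euclidean length times the sensitivity at the edge midpoint.
k = 8
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
edges = {i: [] for i in range(len(pts))}
for i in range(len(pts)):
    for j in np.argsort(d[i])[1:k + 1]:
        w = d[i, j] * cost((pts[i] + pts[j]) / 2)
        edges[i].append((int(j), w))
        edges[int(j)].append((i, w))

# Dijkstra from node 0 (the baseline) to node 1 (the input)
dist, prev, heap = {0: 0.0}, {}, [(0.0, 0)]
while heap:
    du, u = heapq.heappop(heap)
    if u == 1:
        break
    if du > dist.get(u, np.inf):
        continue
    for v, w in edges[u]:
        if du + w < dist.get(v, np.inf):
            dist[v] = du + w
            prev[v] = u
            heapq.heappush(heap, (dist[v], v))

# Reconstruct the low-resistance path by hopping "tree to tree"
path = [1]
while path[-1] != 0:
    path.append(prev[path[-1]])
path.reverse()
print(len(path), dist[1])
```

The resulting path curves around the high-sensitivity bump because straight-through edges are heavily penalized. The second tool (the "rubber band") tackles the same energy-minimization problem in high dimensions, where sampling enough pins to cover the space becomes infeasible.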

Why This Matters

  • More Honest Explanations: In tests with real images (like identifying animals in the Pascal VOC dataset), GIG was much better at pointing out the actual parts of the image that mattered (like the eyes of a cat) compared to the old straight-line method.
  • The Cost: The trade-off is speed. Finding the "river path" takes more computing power than drawing a straight line. It's like taking a scenic, safe detour versus driving straight through a dangerous shortcut. It's slower, but the destination is reached more reliably.

Summary

The paper argues that to understand AI, we shouldn't just draw a straight line from "nothing" to "something." We should follow the path of least resistance, curving around the confusing parts of the model's brain. This gives us a truer, more honest explanation of why the AI made its decision, ensuring that important features aren't hidden by mathematical tricks.
