Imagine you have a very smart, but mysterious, robot that looks at pictures and guesses what they are. Sometimes it's right, sometimes it's wrong. You want to know: "How did the robot decide that?"
Most current tools try to answer this by highlighting the "important" parts of the picture, like a teacher circling key words in a textbook. But these tools are often guesswork: they come with no mathematical guarantee that their answer is correct.
This paper introduces a new, super-rigorous way to explain the robot's brain. The authors, David Kelly and Hana Chockler, treat the image like a puzzle where every pixel is a piece. They want to find the exact pieces that must be there for the robot to make its decision, and the pieces that could be removed without changing the decision.
Here is the breakdown of their new "Causal Explanation" method using simple analogies:
1. The Three Types of "Why"
The authors break down the explanation into three distinct categories, like sorting ingredients in a recipe:
Sufficient (The "Just Enough" Recipe):
Imagine you are baking a cake. A tiny bowl of flour, sugar, and eggs might be enough to make a tiny cake. In the paper, a "Sufficient Explanation" is a group of pixels that, if you kept them and erased the rest of the image, would still make the robot say, "That's a ladybug!"
- Analogy: It's the minimum amount of fuel needed to start a fire.
Necessary (The "Can't Live Without" Ingredients):
Now, imagine you have the whole cake. If you take away the flour, the cake collapses. A "Necessary Explanation" is the set of pixels that must be there. If you remove even one of these, the robot changes its mind.
- Analogy: The flour in the cake. Without it, you don't have a cake.
Complete (The Perfect Balance):
This is the holy grail. A "Complete Explanation" is a group of pixels that is both Sufficient (it's enough to make the decision) and Necessary (you can't remove any of them without changing the decision).
- Analogy: It's the exact, perfect portion of ingredients. Not a drop more, not a drop less. (A toy version of all three checks is sketched right after this list.)
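To make the three definitions concrete, here is a minimal Python sketch. It is not the authors' tool: `model` is a hypothetical black-box classifier that returns a `(label, confidence)` pair, `keep` is a boolean mask the same height and width as the image, and "erasing" a pixel just means overwriting it with a baseline value.

```python
import numpy as np

def mask_to(image, keep, baseline=0.0):
    """Keep only the pixels where `keep` is True; erase the rest to a baseline color."""
    out = np.full_like(image, baseline)
    out[keep] = image[keep]
    return out

def is_sufficient(model, image, keep):
    """Sufficient: the kept pixels alone still produce the original label."""
    orig_label, _ = model(image)
    label, _ = model(mask_to(image, keep))
    return label == orig_label

def is_necessary(model, image, keep):
    """Necessary: erasing these pixels from the full image flips the label."""
    orig_label, _ = model(image)
    label, _ = model(mask_to(image, ~keep))
    return label != orig_label

def is_complete(model, image, keep):
    """Complete: sufficient, and no single pixel can be dropped from the set
    without breaking sufficiency (the "not a drop more" condition)."""
    if not is_sufficient(model, image, keep):
        return False
    for idx in zip(*np.nonzero(keep)):      # try dropping each kept pixel
        reduced = keep.copy()
        reduced[idx] = False
        if is_sufficient(model, image, reduced):
            return False                    # that pixel wasn't actually needed
    return True
```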
2. The "Confidence" Twist (The Volume Knob)
The authors realized that just getting the right answer isn't enough; the robot needs to be confident in its answer.
- The Problem: Sometimes, a tiny group of pixels is enough to make the robot say "Ladybug," but the robot is only 10% sure. It's a weak guess.
- The Solution (a thresholded "Complete"): They introduce a confidence threshold and ask: "Give me the smallest group of pixels that makes the robot say 'Ladybug' and be at least 80% sure."
- The 1-Complete (The "Full Faith" Explanation): This is the ultimate goal, with the threshold pushed all the way up. It finds the pixels needed to make the robot exactly as confident as it was when looking at the full, original photo.
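In the same sketch style, reusing the hypothetical `model` and the `mask_to` helper from above, the two thresholded checks might look like this. How the paper formally defines the threshold is glossed over here; this is just one plausible reading:

```python
def is_confident_sufficient(model, image, keep, threshold=0.8):
    """Sufficiency with a confidence floor: the kept pixels must preserve
    the label AND make the model at least `threshold` confident."""
    orig_label, _ = model(image)
    label, conf = model(mask_to(image, keep))
    return label == orig_label and conf >= threshold

def is_full_faith(model, image, keep):
    """The "1-Complete" flavor: the kept pixels must match the confidence
    the model had on the full, unmasked image."""
    orig_label, orig_conf = model(image)
    label, conf = model(mask_to(image, keep))
    return label == orig_label and conf >= orig_conf
```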
3. The "Adjustment Pixels" (The Seasoning)
This is the most fascinating discovery. Sometimes, the "Complete" pixels get the robot to the right answer, but the confidence level is slightly off.
- The Scenario: The robot looks at a picture of a sink. The "Complete" pixels (the faucet and basin) make it say "Sink," but the confidence drops a little.
- The Adjustment: The robot needs a few extra pixels (maybe the reflection on the water or a specific shadow) to boost its confidence back up to the original level.
- The Metaphor: Think of the "Complete" pixels as the main course of a meal. The "Adjustment Pixels" are the salt and pepper. They aren't the main dish, but without them, the flavor (confidence) isn't quite right.
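One naive way to hunt for such "seasoning" pixels is a greedy search: starting from the complete set, repeatedly add the single outside pixel that most improves confidence until the original level is restored. This is purely illustrative and is not the authors' algorithm; it reuses the hypothetical `model` and `mask_to` from above:

```python
def find_adjustment_pixels(model, image, complete_set, max_extra=50):
    """Greedily add pixels outside the complete set until confidence climbs
    back to the level the model had on the full image. Pixel-by-pixel search
    is far too slow for real images; a real tool would work on regions."""
    orig_label, orig_conf = model(image)
    keep = complete_set.copy()
    added = []
    for _ in range(max_extra):
        _, conf = model(mask_to(image, keep))
        if conf >= orig_conf:
            break                            # original confidence restored
        best_idx, best_conf = None, conf
        for idx in zip(*np.nonzero(~keep)):  # candidate "seasoning" pixels
            trial = keep.copy()
            trial[idx] = True
            label, c = model(mask_to(image, trial))
            if label == orig_label and c > best_conf:
                best_idx, best_conf = idx, c
        if best_idx is None:
            break                            # no single pixel helps further
        keep[best_idx] = True
        added.append(best_idx)
    return added
```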
4. Why is this better than what we have now?
- No "Inside Look" Needed: Most advanced AI tools need to peek inside the robot's brain (looking at its internal code or gradients) to explain things. This new method is Black-Box. It treats the robot like a mystery box: you put an image in, and you get an answer out. You don't need to know how the robot works internally to figure out what it's looking at.
- It Works on Any Model: Whether the robot is a simple one or a complex, deep-learning monster, this method works.
- It's Fast: The authors built a tool that can do this math on a standard computer in about 6 seconds per image.
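The black-box property means the whole method only needs the question-and-answer interface used in the sketches above. Any classifier can be wrapped into it; `predict_probs` here is a placeholder for whatever your framework exposes:

```python
def as_black_box(predict_probs, class_names):
    """Wrap any classifier into the opaque model(image) -> (label, confidence)
    interface used in the sketches above. `predict_probs` is a placeholder for
    whatever your framework exposes (e.g., a softmax over class scores)."""
    def model(image):
        probs = predict_probs(image)
        best = int(np.argmax(probs))
        return class_names[best], float(probs[best])
    return model
```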
5. What did they find?
They tested this on three famous AI models (ResNet, MobileNet, and Swin) and found that different models think differently:
- ResNet is very efficient. It needs very few pixels to be sure of its answer.
- MobileNet is a bit more "needy," requiring more pixels to feel confident.
- The "Inverse" Discovery: When they removed the "Complete" pixels from an image, the robot often saw something completely different. For example, if you remove the pixels that make a "Colobus Monkey" look like a Colobus, the robot might just see a generic "Monkey" or even a "Guenon Monkey." This helps us understand exactly what features distinguish similar-looking things.
Summary
This paper gives us a new, mathematically strict way to interrogate AI. Instead of just saying, "The robot looked at the ears," they can say: "The robot looked at only these 50 pixels to be just as sure it's a cat as it was with the full photo. If you remove any of those 50, it stops being a cat. If you add these other 20 pixels, it becomes even more sure it's a cat."
It turns the "black box" of AI into a transparent puzzle where we can see exactly which pieces matter and why.