VISIONLOGIC: From Neuron Activations to Causally Grounded Concept Rules for Vision Models

Imagine you have a brilliant but mysterious chef (the AI) who can cook a perfect steak every time. You ask, "Why did you add salt?" The chef might point to a pile of salt shakers on the counter and say, "Because salt is usually here when I cook steak."

That's how most current AI explainers work. They look for correlations (things that happen together). But what if the salt shaker is on the counter only because the chef always keeps it there, not because the steak actually needs salt? The chef is offering a plausible-sounding reason that is not the real one, and the explanation is misleading.

VISIONLOGIC is a new framework that acts like a detective to find the real reasons the chef cooks the way they do. It doesn't just look at what's on the counter; it tests what happens if you remove an ingredient.

Here is how VISIONLOGIC works, broken down into three simple steps:

1. The "Light Switch" Translation (Neuron to Predicate)

Deep learning models are made of millions of tiny switches called "neurons." When a neuron fires, it's like a light switch turning on.

The Old Way: Researchers tried to guess what each light switch meant by looking at pictures where it was on. It was like guessing what a light switch controls just by seeing the room lit up.
The VISIONLOGIC Way: They teach the AI to translate these messy light switches into simple Yes/No rules (called "predicates"). Instead of "Neuron 45 is at 0.87 intensity," it becomes "Is the 'squirrel tail' present? YES." It turns the AI's complex math into a simple checklist.

2. The "What If?" Test (Causal Grounding)

This is the magic part. Most methods stop at the checklist. VISIONLOGIC goes further to prove the checklist items actually cause the decision.

The Analogy: Imagine the AI says, "I think this is a squirrel because I see a tail."
The Test: VISIONLOGIC takes the picture of the squirrel and digitally erases the tail (replaces it with static noise).
- If the AI suddenly says, "I don't know what this is anymore," then the tail is causally important. The AI needed the tail to make the decision.
- If the AI still says, "That's a squirrel!" even without the tail, then the tail wasn't the real reason. Maybe the AI was just looking at the background trees.
VISIONLOGIC does this over and over, shrinking the erased area until it finds the smallest region that still matters. It's like a sculptor chipping away stone until only the essential shape remains.

3. The "Rulebook" (Logical Rules)

Once it has proven which features are truly important, VISIONLOGIC writes a simple rulebook for the AI.

Instead of a black box, you get a clear sentence like:

"IF (Squirrel Tail is present) AND (Squirrel Head is present) AND (No Dog Ears are present) THEN: It is a Squirrel."

Why is this a big deal?

It Catches "Cheaters": Imagine an AI that has learned that "Cow" really means "Green Grass." Show it a cow in a desert and it fails. A correlation-based method can only report that grass and the cow label tend to show up together. VISIONLOGIC erases the grass and checks whether the cow decision changes: if it does, the shortcut is exposed; if it does not, the AI is relying on the animal itself, such as the horns and the udder. It finds the cause, not the coincidence.
It Works Across Architectures: Whether the AI is a convolutional network (CNN) or a more recent Vision Transformer (ViT), VISIONLOGIC can translate its internal activity into human-readable logic.
Humans Trust It More: In tests with real people, VISIONLOGIC helped humans understand how the AI was thinking much better than previous methods. People could actually predict what the AI would do next because the rules made sense.

The Bottom Line

VISIONLOGIC is like giving the AI a translator that speaks "Human Logic" instead of "Math." It doesn't just guess what the AI is thinking; it checks by changing the input and watching how the AI's decisions respond, so the explanations we get are based on tested cause-and-effect, not just coincidence. This can make AI safer and more trustworthy for important jobs, like diagnosing diseases or driving cars.

1. Problem Statement

Deep learning vision models, both CNNs and Vision Transformers, remain black boxes, which hinders trust and deployment in high-stakes scenarios. Concept-based explanation methods (e.g., TCAV, ACE, CRAFT) have improved interpretability by mapping internal representations to high-level semantic concepts, but they share two critical limitations:

Correlational vs. Causal: Existing methods rely almost entirely on statistical correlations between concepts and model outputs. They lack causal validation, meaning they often identify spurious correlations (e.g., associating "pasture" with "cow" because they co-occur in training data) rather than genuine causal features.
Lack of Global Logic: Current approaches often provide local attributions or concept scores but fail to generate global, interpretable logical rules that explain the model's decision-making process as a whole.

2. Methodology: The VISIONLOGIC Framework

VISIONLOGIC is a neural-symbolic framework that bridges neural activations and symbolic reasoning. It operates in three distinct stages to produce faithful, hierarchical explanations:

Stage 1: Deriving Predicates from Neuron Activations

The framework transforms continuous neuron activations into discrete, binary predicates (logical atoms).

Learning Thresholds: Instead of using fixed heuristics, VISIONLOGIC learns per-channel activation thresholds ($T_j$) and sharpness parameters ($s_j$).
Rank-Awareness: To handle the fact that a neuron's importance varies by input, it defines predicates based on both activation magnitude and contribution rank. A predicate $p_{j, \le k}(x)$ is true if the neuron's contribution to the class logit is within the top-$k$ contributors and its activation exceeds the threshold.
Polysemanticity Handling: The framework allows a single channel to encode multiple features (e.g., positive and negative branches for GELU activations) and uses structured sparsity (group lasso) to select the most predictive rank window ($k \in \{1, 2, 3\}$) per channel, preventing predicate proliferation.
Distillation: A lightweight linear head is trained to mimic the base model's predictions using these predicates, ensuring the learned thresholds are stable and predictive.
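
The rank-aware predicate described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a channel's contribution to the class logit is its activation times the weight of a linear classification head, and it assumes the sharpness parameter enters through a sigmoid relaxation of the hard threshold during training. The names `soft_gate` and `rank_aware_predicates` are hypothetical.

```python
import math

def soft_gate(activation, threshold, sharpness):
    """Differentiable stand-in for `activation > threshold`.

    Assumption: a sigmoid with slope `sharpness`; the source only says
    that a threshold T_j and a sharpness s_j are learned per channel.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (activation - threshold)))

def rank_aware_predicates(activations, class_weights, thresholds, k):
    """Return {channel: bool} for the predicate p_{j,<=k}(x).

    Assumption: channel j contributes activations[j] * class_weights[j]
    to the class logit (a linear head on top of the channels).
    """
    contributions = [a * w for a, w in zip(activations, class_weights)]
    # Rank channels by contribution, largest first (rank 1 = top contributor).
    order = sorted(range(len(contributions)),
                   key=lambda j: contributions[j], reverse=True)
    rank = {j: r + 1 for r, j in enumerate(order)}
    return {
        j: rank[j] <= k and activations[j] > thresholds[j]
        for j in range(len(activations))
    }

# Toy input with four channels (all values are made up for illustration).
activations = [0.9, 0.2, 1.5, 0.7]
class_weights = [1.0, 2.0, 0.1, 1.2]
thresholds = [0.5, 0.5, 0.5, 0.8]
preds = rank_aware_predicates(activations, class_weights, thresholds, k=2)
print(preds)  # {0: True, 1: False, 2: False, 3: False}
```

In this toy input, channel 2 has the largest activation but a small contribution, so its predicate is false; channel 3 is the second-largest contributor but sits below its threshold. Only channel 0 passes both tests, which is the point of combining magnitude with rank.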

Stage 2: Inducing Global Logical Rules

Once predicates are defined, the framework constructs symbolic rules to approximate the model's class-level decision logic.

Disjunctive Normal Form (DNF): It aggregates the predicate patterns observed in correctly classified training examples to form a DNF rule for each class.
Rank-Based Inference Score: Since exact DNF matching is brittle on unseen data, VISIONLOGIC uses a rank profile. It calculates an explanation score $S(x, c)$ for each class $c$ based on the average contribution rank of the active predicates for that class. A low average rank means those predicates sit among the top contributors, so the predicted class is the one with the lowest score (i.e., the class whose characteristic predicates best explain the active features).
Interpretability: This results in compact, human-readable rules (e.g., "If feature A is present AND feature B is present, then Class X").
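
A minimal sketch of the two ideas in this stage follows. The exact form of $S(x, c)$ is not given here, so the score below simply averages the ranks of the active predicates that appear in a class's rule; that choice, the predicate names, and the function names (`induce_dnf`, `matches_dnf`, `explanation_score`, `predict`) are all assumptions made for illustration.

```python
def induce_dnf(examples):
    """Build one DNF rule per class from correctly classified examples.

    `examples` holds (active_predicates, true_label, model_label) triples.
    Each distinct set of active predicates becomes one conjunction.
    """
    rules = {}
    for active, label, predicted in examples:
        if predicted == label:  # misclassified examples are skipped
            rules.setdefault(label, set()).add(frozenset(active))
    return rules

def matches_dnf(rule, active):
    """A DNF rule fires if any of its conjunctions is fully active."""
    return any(conj <= set(active) for conj in rule)

def explanation_score(active_ranks, class_predicates):
    """Average rank of the active predicates that belong to the class.

    Assumption: returns infinity when none of the class's predicates fire.
    """
    ranks = [r for p, r in active_ranks.items() if p in class_predicates]
    return sum(ranks) / len(ranks) if ranks else float("inf")

def predict(ranks_by_class, rules):
    """Pick the class with the lowest explanation score."""
    scores = {}
    for c, rule in rules.items():
        vocab = set().union(*rule)  # every predicate used by class c's rule
        scores[c] = explanation_score(ranks_by_class.get(c, {}), vocab)
    return min(scores, key=scores.get), scores

examples = [
    ({"tail", "head"}, "squirrel", "squirrel"),
    ({"tail", "body"}, "squirrel", "squirrel"),
    ({"ears", "snout"}, "dog", "dog"),
    ({"ears", "tail"}, "dog", "squirrel"),  # model was wrong: ignored
]
rules = induce_dnf(examples)

# Hypothetical contribution ranks of the active predicates, per class.
ranks_by_class = {
    "squirrel": {"tail": 1, "head": 2, "ears": 7},
    "dog": {"tail": 9, "head": 8, "ears": 3},
}
label, scores = predict(ranks_by_class, rules)
print(label, scores)  # squirrel {'squirrel': 1.5, 'dog': 3.0}
```

Exact matching (`matches_dnf`) requires a whole conjunction to be active, which is why it can fail on unseen inputs; the score still ranks classes when only some of a rule's predicates fire.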

Stage 3: Causal Grounding via Ablation

This is the core differentiator. VISIONLOGIC grounds abstract predicates to specific visual regions using causal tests.

Iterative Refinement: Starting with a large bounding box (seeded from feature maps for CNNs or patch grids for ViTs), the algorithm iteratively shrinks the region.
Causal Test: It replaces the region with noise (ablation). If the predicate deactivates ($1 \to 0$) upon ablation, the region is deemed causally necessary.
Sufficiency Check: It verifies that the region alone is sufficient to trigger the predicate by pasting it onto a noise canvas.
Segmentation Refinement: To align with object boundaries, the refined box is intersected with segmentation masks (e.g., from SAM or Mask R-CNN) and re-validated.
Concept Formation: Validated regions across multiple images of the same class are aggregated to form consistent, causally grounded visual concepts.
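
The necessity and sufficiency tests and the shrinking loop can be sketched on a toy grayscale image. This is an illustration under stated assumptions, not the paper's algorithm: the shrink schedule (one row or column at a time, greedily), the low-amplitude noise fill, the list-of-lists image, and the toy predicate are all hypothetical, and the segmentation-mask step is omitted.

```python
import random

def ablate(image, box, rng):
    """Copy `image` with `box` = (top, left, bottom, right) replaced by noise.

    Bottom and right are exclusive. Noise is uniform in [0, 0.1) by assumption.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = rng.random() * 0.1
    return out

def is_necessary(image, box, predicate, rng):
    """Necessary: the predicate flips 1 -> 0 when the box is ablated."""
    return predicate(image) and not predicate(ablate(image, box, rng))

def is_sufficient(image, box, predicate, rng):
    """Sufficient: the box alone, pasted on a noise canvas, fires the predicate."""
    top, left, bottom, right = box
    canvas = [[rng.random() * 0.1 for _ in row] for row in image]
    for r in range(top, bottom):
        for c in range(left, right):
            canvas[r][c] = image[r][c]
    return predicate(canvas)

def refine_box(image, box, predicate, rng):
    """Greedily trim one row or column at a time while the box stays necessary."""
    changed = True
    while changed:
        changed = False
        top, left, bottom, right = box
        candidates = [
            (top + 1, left, bottom, right),
            (top, left + 1, bottom, right),
            (top, left, bottom - 1, right),
            (top, left, bottom, right - 1),
        ]
        for cand in candidates:
            t, l, b, r = cand
            if t < b and l < r and is_necessary(image, cand, predicate, rng):
                box, changed = cand, True
                break
    return box

# Toy 6x6 image with a bright 2x2 patch at rows 2-3, columns 3-4.
image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (3, 4):
        image[r][c] = 1.0

# Toy predicate standing in for a thresholded channel: "any bright pixel".
def bright(img):
    return any(v > 0.5 for row in img for v in row)

rng = random.Random(0)
box = refine_box(image, (0, 0, 6, 6), bright, rng)
print(box)  # (2, 3, 4, 5)
```

Starting from the full image, the loop stops at the box that exactly covers the bright patch, because trimming any further leaves a bright pixel outside the ablated area and the predicate stays on. Calling `is_sufficient(image, box, bright, rng)` on the result returns `True` for this toy case.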

3. Key Contributions

Causal Validation: VISIONLOGIC is presented as the first framework to validate concepts with ablation-based causal tests, moving beyond correlation to check that discovered features are necessary for the model's decision.
Neural-Symbolic Integration: It successfully bridges neural representations and symbolic reasoning by learning activation thresholds to create a reusable predicate vocabulary and inducing global logical rules.
Iterative Localization Algorithm: It proposes an efficient algorithm for precisely localizing causally relevant image regions, combining bounding box refinement with off-the-shelf segmentation for high precision.
Scalability: The method is demonstrated on large-scale modern architectures (ResNet, ConvNeXt, ViT, Swin) trained on ImageNet-1k, a scale where explicit rule extraction was previously infeasible.

4. Experimental Results

Human Evaluation (Utility Study)

Setup: A large-scale study with 531 participants compared VISIONLOGIC against a no-explanation baseline, saliency maps (the control condition), and state-of-the-art concept methods (ACE, CRAFT) across three scenarios: bias detection (Husky vs. Wolf), novel strategy identification (Otter vs. Beaver), and failure analysis (Fox vs. Red Fox).
Findings: VISIONLOGIC achieved significantly higher utility scores (participants' ability to predict model behavior on unseen images) than all other methods.
- In the bias detection scenario, VISIONLOGIC scored 1.25 (normalized), significantly outperforming ACE (1.01) and CRAFT (0.89).
- Statistical tests (Kruskal-Wallis and Dunn's test) confirmed these improvements were statistically significant ($p < 0.05$).

Model Performance & Fidelity

Predictive Power: VISIONLOGIC largely retains the discriminative power of the base models. On the ImageNet validation set, it achieved Top-5 accuracy > 90% for covered images across all architectures (e.g., ViT: 97.38%, ConvNeXt: 97.23%).
Coverage & Fidelity: It covers 80–89% of images with valid explanations and maintains high fidelity (76–88%) on covered images, meaning the symbolic rules accurately reflect the neural network's decisions.
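
As a rough illustration of how the two metrics relate, the sketch below treats coverage as the fraction of images for which the rules produce any explanation, and fidelity as the fraction of covered images where the rule-based prediction agrees with the base model. These definitions are an assumed reading of the text above, and `coverage_and_fidelity` is a hypothetical helper.

```python
def coverage_and_fidelity(rule_preds, model_preds):
    """Compute (coverage, fidelity) from paired predictions.

    rule_preds[i] is None when no rule covers image i (assumed convention).
    Fidelity is measured on covered images only.
    """
    covered = [(r, m) for r, m in zip(rule_preds, model_preds) if r is not None]
    coverage = len(covered) / len(rule_preds) if rule_preds else 0.0
    fidelity = sum(r == m for r, m in covered) / len(covered) if covered else 0.0
    return coverage, fidelity

# Made-up predictions for five images.
rule_preds = ["cat", "dog", None, "cat", "fox"]
model_preds = ["cat", "dog", "dog", "fox", "fox"]
coverage, fidelity = coverage_and_fidelity(rule_preds, model_preds)
print(coverage, fidelity)  # 0.8 0.75
```

Because fidelity is computed only over covered images, the two numbers should be read together: high fidelity with low coverage would mean the rules are accurate but explain few inputs.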

Qualitative Analysis

Concept Discovery: The system successfully identified semantically meaningful concepts (e.g., "bird beak," "fox ears," "church tops") that align with human intuition.
Polysemanticity: It revealed that single predicates can encode multiple concepts (e.g., a predicate detecting both "fox ears" and "church tops" due to shared triangular geometry), and conversely, one concept can be encoded by multiple predicates.
Global vs. Local: The framework distinguishes between local modular concepts and global structural predicates (e.g., a predicate encoding the entire shape of a squirrel).

5. Significance and Impact

Trustworthy AI: By providing causally grounded explanations, VISIONLOGIC addresses the fundamental limitation of spurious correlations in current interpretability methods, offering more reliable insights for high-stakes applications.
Bridging the Gap: It unifies the high-dimensional, opaque representations of deep neural networks with human-understandable symbolic logic, making complex models more transparent.
Future Directions: The work opens avenues for using these interpretable rules to improve model robustness, detect biases, and potentially enhance generalization by leveraging the discovered causal structure.

In summary, VISIONLOGIC represents a paradigm shift from "what the model looks at" (correlation) to "what the model relies on" (causation), delivering explanations that are not only human-interpretable but also methodologically rigorous.