Digging Deeper: Learning Multi-Level Concept Hierarchies

This paper introduces Multi-Level Concept Splitting (MLCS) and Deep-HiCEMs to overcome the limitations of shallow hierarchies in concept-based models: MLCS automatically discovers multi-level concept structures from coarse annotations, and Deep-HiCEMs enable effective interventions at any level of abstraction.

Oscar Hill, Mateo Espinosa Zarlenga, Mateja Jamnik

Published 2026-03-12

The Big Picture: Teaching AI to Think in Layers

Imagine you are trying to teach a robot to recognize a Red Apple.

  • The Old Way (Flat Thinking): You tell the robot, "Look for 'Red' and look for 'Apple'." The robot learns these two things are separate. It doesn't understand that "Red" is a type of color, or that "Apple" is a type of fruit. It's like giving the robot a list of 1,000 unrelated words and hoping it figures out the connections on its own.
  • The Previous Upgrade (Shallow Thinking): Researchers realized this was too simple. They taught the robot, "Apple" is a big category, and inside that, there are sub-categories like "Red Apple" and "Green Apple." This is better, but it stops there. It's a two-story building: the ground floor (Apple) and the first floor (Red Apple).
  • The New Breakthrough (Deep Thinking): This paper introduces a way to teach the robot to build a skyscraper of understanding. It can go from "Fruit" → "Apple" → "Red Apple" → "Crunchy Red Apple."

The authors, Oscar Hill, Mateo Espinosa Zarlenga, and Mateja Jamnik, have created two new tools to make this happen: MLCS (the discovery tool) and Deep-HiCEM (the thinking machine).


1. The Problem: AI is Too "Flat"

Most AI models today are like a flat map. They know that "Dog" and "Cat" are animals, but they don't naturally understand that a "Golden Retriever" is a specific kind of dog.

To fix this, previous researchers tried to build a hierarchy (a family tree) for AI concepts. But they could only build one level deep.

  • Level 1: The big idea (e.g., "Vehicle").
  • Level 2: The sub-idea (e.g., "Car").
  • The Limit: They couldn't go deeper to "Sedan" or "Sports Car" without needing a human to manually label every single one. That takes forever and is expensive.
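The "floors" above are just a tree of concepts, and the difference between the old and new approaches is how deep that tree is allowed to grow. A minimal sketch (the concept names and the `Concept` class are illustrative, not from the paper):

```python
# A toy multi-level concept hierarchy as a plain tree.
# Concept names here are illustrative examples, not the paper's datasets.

from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    children: list["Concept"] = field(default_factory=list)

    def depth(self) -> int:
        # A leaf counts as depth 1; each level of children adds one floor.
        if not self.children:
            return 1
        return 1 + max(child.depth() for child in self.children)


# Prior hierarchical models stop at two levels (the "two-story house")...
shallow = Concept("Vehicle", [Concept("Car")])

# ...while the hierarchy this paper targets can keep going.
deep = Concept("Vehicle", [
    Concept("Car", [
        Concept("Sedan"),
        Concept("Sports Car"),
    ]),
])

print(shallow.depth())  # 2
print(deep.depth())     # 3
```

The point of the paper is precisely that going from `depth() == 2` to arbitrary depth previously required a human to hand-label every new floor.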

2. The Solution: MLCS (The "X-Ray" Machine)

The authors invented Multi-Level Concept Splitting (MLCS). Think of this as an X-ray machine for AI brains.

Usually, to teach an AI about "Red Apples," you need thousands of photos labeled "Red Apple." MLCS is magic because it doesn't need those labels.

  • How it works: You give the AI a picture of an apple and just say, "This is a fruit."
  • The Magic: MLCS looks inside the AI's "brain" (its internal math) and says, "Hey, I see a pattern here that looks like 'Red,' and inside that, I see a pattern that looks like 'Shiny'."
  • The Result: It automatically discovers the hidden layers of the hierarchy (Fruit → Apple → Red Apple) without anyone telling it what to look for. It's like finding a secret underground tunnel system in a building you thought was just a single floor.
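One way to build intuition for this "look inside the brain" step: if every sample shares the same coarse label, but their internal embeddings form distinct clusters, each cluster is a candidate sub-concept. The sketch below illustrates that idea with a toy 2-means clustering; it is not the paper's MLCS algorithm, and all the data and names are made up:

```python
# Toy illustration of concept splitting: samples all labeled "apple"
# get clustered by their internal embeddings, and each cluster becomes
# a candidate sub-concept. This is NOT the paper's MLCS method.

import numpy as np


def split_concept(embeddings: np.ndarray, k: int = 2, iters: int = 10) -> np.ndarray:
    """Assign each embedding to one of k candidate sub-concepts (toy k-means)."""
    # Deterministic init: spread the starting centers across the data.
    idx = np.linspace(0, len(embeddings) - 1, k).astype(int)
    centers = embeddings[idx].astype(float)
    for _ in range(iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(embeddings[:, None] - centers[None, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = embeddings[labels == j].mean(axis=0)
    return labels


# Two well-separated blobs standing in for "red apple" vs. "green apple"
# activations inside a model that was only ever told "apple".
rng = np.random.default_rng(1)
red = rng.normal(loc=0.0, scale=0.1, size=(50, 8))
green = rng.normal(loc=5.0, scale=0.1, size=(50, 8))
labels = split_concept(np.vstack([red, green]))

print(sorted(set(labels[:50].tolist())))  # [0]  -> one sub-concept
print(sorted(set(labels[50:].tolist())))  # [1]  -> the other sub-concept
```

Running the same splitting step again inside each discovered cluster is what would extend the hierarchy another level down, which mirrors the recursive "Fruit → Apple → Red Apple" discovery described above.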

3. The Solution: Deep-HiCEM (The "Skyscraper" Architect)

Once MLCS finds these hidden layers, you need a new type of building to house them. Enter Deep-HiCEM.

Think of a standard AI model as a bungalow (one floor). The old "Hierarchical" models were two-story houses. Deep-HiCEM is a skyscraper.

  • Arbitrary Depth: It can have as many floors as needed. It can handle "Animal" → "Mammal" → "Dog" → "Poodle" → "Fluffy Poodle."
  • Intervention (The "Human-in-the-Loop"): This is the coolest part. Because the AI understands the hierarchy, you can talk to it like a human.
    • Scenario: The AI thinks a picture is a "Dog," but you know it's actually a "Wolf."
    • Old AI: You have to retrain the whole thing or guess which specific feature is wrong.
    • Deep-HiCEM: You can just say, "No, that's not a dog, it's a wolf." Because the AI knows "Wolf" is a close relative of "Dog" rather than a kind of "Dog," it instantly updates its understanding of the whole picture. You can fix the AI's mistakes in real time by correcting the concepts.
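The mechanics of such an intervention can be sketched very roughly: the final prediction reads off the concept scores, so clamping one concept to the human-provided value (and pushing down its mutually exclusive siblings) changes everything downstream. The concept names, scores, and update rule below are all invented for illustration and are not Deep-HiCEM's actual procedure:

```python
# Toy sketch of a concept-level intervention. The scores, the "task head",
# and the sibling-update rule are illustrative, not the paper's method.

# Predicted concept scores at one level of the hierarchy.
concepts = {"Dog": 0.8, "Wolf": 0.2}


def predict(scores: dict[str, float]) -> str:
    # Stand-in task head: call it a pet if the Dog score dominates.
    return "pet" if scores["Dog"] > scores["Wolf"] else "wild animal"


def intervene(scores: dict[str, float], name: str, value: float = 1.0) -> dict[str, float]:
    """Clamp one concept to the human-provided value and push its
    mutually exclusive siblings to the complement."""
    fixed = dict(scores)
    fixed[name] = value
    for sibling in fixed:
        if sibling != name:
            fixed[sibling] = 1.0 - value
    return fixed


print(predict(concepts))                     # pet
print(predict(intervene(concepts, "Wolf")))  # wild animal
```

The key property the paper claims for Deep-HiCEM is that this kind of correction can be applied at *any* level of the hierarchy, not just at the leaves, and the model propagates it consistently.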

4. What Did They Prove? (The Results)

The team tested this on several datasets, including a synthetic "PseudoKitchen" dataset built with ingredients (like "Apple") and sub-ingredients (like "Red Apple").

  • Did it find the hidden layers? Yes! The AI successfully discovered "sub-sub-concepts" (like specific colors or textures) that were never shown to it during training.
  • Did it get smarter? Yes. The AI was just as good at guessing the final answer (e.g., "Is this a fruit?") as the old models, but now it had a much richer understanding of why.
  • Can we fix it? Yes. When the researchers manually corrected the AI's concepts (e.g., "Actually, this is a red apple, not a green one"), the AI's final answer got better. It proved that the AI was listening to the hierarchy.

The Takeaway

This paper is about giving AI a deeper, more human-like way of thinking.

Instead of just memorizing a flat list of facts, the AI learns to organize knowledge into a family tree. It learns that a "Red Apple" isn't just a random word; it's a specific type of "Apple," which is a specific type of "Fruit."

Why does this matter?

  1. Less Work: We don't need humans to label every tiny detail. The AI can find the details itself.
  2. More Trust: If an AI makes a mistake, we can understand where in the hierarchy it went wrong and fix it easily.
  3. Better Explanations: Instead of saying "I think this is a cat because of pixels," the AI can say, "I think this is a cat because it has fur, whiskers, and pointy ears," and explain how those features fit together.

In short, they taught the AI to stop looking at the world in a flat line and start seeing the depth of reality.