G-LoG Bi-filtration for Medical Image Classification

The Big Picture: Teaching Computers to "See" Like a Doctor

Imagine you are trying to teach a computer to look at an X-ray or an MRI scan and tell if a patient has a disease. Usually, we teach computers by feeding them millions of images and letting them learn patterns, kind of like how a child learns to recognize a cat by seeing many different cats. This is called Deep Learning.

However, Deep Learning has some problems:

It needs a massive amount of data.
It's a "black box"—we often don't know why it made a decision.
It can get confused by noise (like static on an old TV).

This paper introduces a new, smarter way to teach the computer. Instead of just showing it the raw picture, the authors give the computer a topological map of the image. Think of it as giving the computer a "skeleton" of the image that highlights the most important shapes and connections, ignoring the messy details.

The Problem with the Old Way (Single-Parameter)

Traditionally, scientists used a method called Persistent Homology to create these maps. Imagine you are looking at a landscape.

The Old Method: You might only look at the height of the mountains. You say, "If the mountain is higher than 100 feet, it's a peak."
The Flaw: This is too simple. Two mountains might be the same height, but one is a sharp, dangerous peak, and the other is a gentle hill. Looking only at height misses the shape.

The New Solution: G-LoG (The "Double-Check" System)

The authors created a new method called G-LoG Bi-Filtration. They realized that to understand an image fully, you need to look at it through two different lenses at the same time.

Think of it like inspecting a piece of fruit:

Lens 1 (The Gaussian Filter): This is like looking at the fruit to see its overall smoothness and color. It blurs out the tiny scratches and dirt (noise) so you can see the big shape.
Lens 2 (The Laplacian of Gaussian): This is like looking at the fruit to find the edges and textures. It highlights where the skin changes from green to red, or where a bruise starts.

The Magic Trick:
The authors realized that if you just look at these two things separately, you don't get much new information. But, if you look at where they overlap (the "intersection"), you get a super-powerful description of the object.

Analogy: Imagine trying to find a specific person in a crowd.
- Method A: "Find everyone wearing a red hat."
- Method B: "Find everyone wearing blue shoes."
- Method C (G-LoG): "Find everyone wearing a red hat AND blue shoes."
Method C is far more specific: it filters out the "noise" of the crowd and finds exactly who you are looking for.

Why is this Special? (The Stability Proof)

The paper also proves mathematically that this method is stable.

Analogy: Imagine you are drawing a map of a city. If your pencil slips a little (a smudge or a slight wobble), the map shouldn't suddenly look like a completely different city.
The authors proved that if the medical image has a little bit of "noise" (like a slight blur or a pixel error), their G-LoG method won't get confused. The "map" it creates stays consistent. This is crucial for medical safety; you don't want a diagnosis to change just because the X-ray was slightly blurry.

The Results: Simple Tools, Big Wins

The authors tested this on the MedMNIST dataset, a large, standardized collection of small medical images that is often treated as a "Hello World" benchmark for medical AI.

They compared their method against:

Old Topological methods (Single-lens).
Complex Deep Learning models (like ResNet and Google's AutoML), which are far larger and need much more computing power to train.

The Surprise:
They used a very simple, lightweight computer brain (called an MLP) to read the "maps" created by their G-LoG method.

Result: This simple brain, using the G-LoG maps, performed just as well (and sometimes better) than the giant, complex supercomputers that looked at the raw images.
Why it matters: It means you don't always need a massive, expensive supercomputer to diagnose diseases. If you give the computer the right "topological map" (the G-LoG bi-filtration), even a small, simple tool can do a great job.

Summary in a Nutshell

The Goal: Make medical image analysis more accurate and less dependent on massive data.
The Tool: G-LoG Bi-Filtration. It looks at an image through two lenses (smoothness and edges) simultaneously to create a compact "skeleton" of the data.
The Proof: It's mathematically stable (small changes in the image cause only small changes in the map) and creates better features than looking at the image one way at a time.
The Win: A simple computer program using this tool can match, and sometimes beat, complex, heavy AI models, which could make medical diagnosis faster, cheaper, and more reliable.

In short: They didn't build a bigger, stronger engine; they built a better set of headlights so the car can see the road clearly, even with a smaller engine.

1. Problem Statement

Topological Data Analysis (TDA), specifically persistent homology, has shown promise in extracting geometric and topological features from medical images. However, most existing applications rely on single-parameter filtrations (e.g., Vietoris-Rips, lower-star), which often fail to capture complex structural relationships inherent in medical data.

While multi-parameter persistent homology offers a richer representation, constructing effective multi-parameter filtrations directly from images remains a significant challenge. Existing methods often require complex operator selections (like GENEO) or result in "essentially single-parameter" behaviors if the filter functions are independent (i.e., their sublevel sets do not intersect meaningfully). Furthermore, there is a lack of stable, efficient, and accessible bi-filtration methods specifically tailored for volumetric medical imaging that can compete with deep learning baselines without requiring massive labeled datasets.

2. Methodology: G-LoG Bi-Filtration

The authors propose G-LoG (Gaussian-Laplacian of Gaussian), a novel bi-filtration framework designed to generate robust multi-parameter persistence modules from medical images.

Core Concept

The method leverages the complementary nature of two operators:

Gaussian Smoothing ( $\gamma_1$ ): Eliminates noise and captures global intensity structures.
Laplacian of Gaussian (LoG) ( $\gamma_2$ ): Enhances boundaries and detects edges (texture).

By convolving the input image function $\phi$ with both a Gaussian kernel $G$ and the Laplacian of the Gaussian kernel $\Delta G$ , the authors define a bi-parameter filter function $\vec{\gamma}_\phi = (\gamma_1, \gamma_2)$ .

Mathematical Formulation

For an input function $\phi: \mathbb{R}^n \to \mathbb{R}$ :

$\gamma_1(x) = (\phi * G)(x)$ (Gaussian convolution)
$\gamma_2(x) = (\phi * \Delta G)(x)$ (LoG convolution)

The bi-filtration is constructed by taking sublevel sets of these two functions simultaneously: $X_{s,t} = \{x \mid \gamma_1(x) \leq s, \gamma_2(x) \leq t\}$ .
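The two filter functions and a discrete sublevel set can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes SciPy's `ndimage` filters as the discrete stand-ins for $\phi * G$ and $\phi * \Delta G$, uses the same $\sigma$ for both operators, and runs on a random 28×28 array in place of a real image. The function names `glog_filter_values` and `sublevel_set` are hypothetical.

```python
import numpy as np
from scipy import ndimage


def glog_filter_values(image, sigma=0.5):
    """Return (gamma1, gamma2): Gaussian-smoothed and LoG-filtered copies of image.

    Assumption: one shared sigma for both operators.
    """
    image = np.asarray(image, dtype=float)
    gamma1 = ndimage.gaussian_filter(image, sigma=sigma)    # discrete phi * G
    gamma2 = ndimage.gaussian_laplace(image, sigma=sigma)   # discrete phi * (Laplacian of G)
    return gamma1, gamma2


def sublevel_set(gamma1, gamma2, s, t):
    """Boolean mask of pixels x with gamma1(x) <= s and gamma2(x) <= t."""
    return (gamma1 <= s) & (gamma2 <= t)


# Random stand-in for a grayscale image normalized to [0, 1).
rng = np.random.default_rng(0)
image = rng.random((28, 28))

g1, g2 = glog_filter_values(image)
small = sublevel_set(g1, g2, s=0.4, t=0.0)
large = sublevel_set(g1, g2, s=0.6, t=0.5)

# Raising either threshold can only add pixels, so `small` is contained in `large`.
print(small.sum(), large.sum())
```

The containment noted in the last comment is what makes the family $X_{s,t}$ a bi-filtration: the sets grow monotonically in both $s$ and $t$.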

Key Theoretical Contribution: Stability

The paper proves that the persistence modules generated by G-LoG are stable: the interleaving distance between the modules of two input functions is bounded by a constant multiple of the maximum-norm distance between those functions.

Theorem: $d_I(M(\vec{\gamma}_{\phi_1}), M(\vec{\gamma}_{\phi_2})) \leq C \cdot \|\phi_1 - \phi_2\|_\infty$ .
This ensures that small perturbations in the medical image (e.g., noise) result in bounded changes in the topological features, a crucial property for practical medical applications.
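One way to see why a bound of this form is plausible is sketched below. It is an illustration under stated assumptions, not the paper's own proof. It assumes the standard stability result for sublevel-set multi-filtrations, $d_I(M(\vec{f}), M(\vec{g})) \leq \max_i \|f_i - g_i\|_\infty$ , and that $G$ and $\Delta G$ are integrable. The notation $\gamma_i^{\phi}$ for the components of $\vec{\gamma}_\phi$ is introduced here for convenience. Linearity of convolution and Young's inequality then give:

$$
\begin{aligned}
\|\gamma_1^{\phi_1} - \gamma_1^{\phi_2}\|_\infty &= \|(\phi_1 - \phi_2) * G\|_\infty \leq \|G\|_1 \, \|\phi_1 - \phi_2\|_\infty, \\
\|\gamma_2^{\phi_1} - \gamma_2^{\phi_2}\|_\infty &= \|(\phi_1 - \phi_2) * \Delta G\|_\infty \leq \|\Delta G\|_1 \, \|\phi_1 - \phi_2\|_\infty, \\
d_I\big(M(\vec{\gamma}_{\phi_1}), M(\vec{\gamma}_{\phi_2})\big) &\leq \max_{i \in \{1,2\}} \|\gamma_i^{\phi_1} - \gamma_i^{\phi_2}\|_\infty \leq \max\big(\|G\|_1, \|\Delta G\|_1\big) \, \|\phi_1 - \phi_2\|_\infty.
\end{aligned}
$$

Under these assumptions one admissible constant is $C = \max(\|G\|_1, \|\Delta G\|_1)$ ; the constant stated in the paper may differ.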

Implementation Pipeline

Preprocessing: Images are converted to grayscale and normalized.
Filtration Construction: Bi-parameter filtrations are generated using the multipers library (for multi-parameter persistence) and GUDHI (for cubical complexes).
Vectorization: The resulting persistence modules are converted into Multi-parameter Persistence Images (MPIs) using Gaussian kernels.
- 2D Images: Concatenation of $H_0$ and $H_1$ images $\to$ 5000-dim vector.
- 3D Volumes: Concatenation of $H_0, H_1, H_2$ images $\to$ 7500-dim vector.
Classification: A simple Multi-Layer Perceptron (MLP) is trained on these topological vectors.
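The last two pipeline steps (concatenating per-degree images, then training an MLP) can be sketched as follows. This is a hedged illustration only: random arrays stand in for real multi-parameter persistence images, the 50×50 resolution per homology degree is an assumption chosen so that two images give 5000 values and three give 7500, and scikit-learn's `MLPClassifier` with one hidden layer is a stand-in for whatever MLP the authors used. The helper `assemble_features` is hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

RES = 50  # assumed per-degree image resolution: 50 * 50 = 2500 values


def assemble_features(images_by_degree):
    """Flatten one image per homology degree and concatenate them into one vector."""
    return np.concatenate([np.asarray(img).ravel() for img in images_by_degree])


rng = np.random.default_rng(0)

# 40 synthetic "2D samples", each with an H0 and an H1 image (random placeholders).
X_2d = np.stack([assemble_features(rng.random((2, RES, RES))) for _ in range(40)])
y = rng.integers(0, 2, size=40)  # random binary labels, for shape checking only

# A 3D volume would contribute H0, H1 and H2 images instead.
x_3d = assemble_features(rng.random((3, RES, RES)))

# Small MLP; hidden size and iteration count are arbitrary illustrative choices.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=50, random_state=0)
clf.fit(X_2d, y)
pred = clf.predict(X_2d)

print(X_2d.shape, x_3d.shape, pred.shape)
```

Because the inputs and labels here are random, the fitted classifier is meaningless; the sketch only shows how the feature dimensions quoted above arise and how little machinery sits on top of the topological vectors.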

3. Key Contributions

Novel Bi-Filtration: Introduction of G-LoG, a simple yet effective bi-filtration that ensures the intersection of sublevel sets is non-empty, avoiding the "degenerate" case where multi-parameter filtration reduces to single-parameter behavior.
Stability Proof: Theoretical proof that G-LoG persistence modules are stable under the interleaving distance, guaranteeing robustness against image noise.
Efficiency: The method is computationally efficient, generating persistence modules in ~0.1s for 2D images and ~90s for 3D volumes on standard hardware.
Performance with Simple Models: Demonstrates that topological features extracted via G-LoG allow a simple MLP to achieve performance comparable to complex deep learning architectures (ResNet, AutoML) trained on raw pixel data.

4. Experimental Results

The method was evaluated on the MedMNIST (v2) dataset, which comprises twelve 2D and six 3D biomedical image classification tasks.

Comparison Baselines: ResNet (18/50), Auto-sklearn, AutoKeras, Google AutoML Vision, and the single-parameter Topo-Med approach.
2D Performance:
- G-LoG significantly outperformed single-parameter Topo-Med (e.g., +41.7% accuracy on ChestMNIST).
- It achieved competitive results against deep learning baselines. For instance, on PathMNIST, it achieved 95.5% AUC (outperforming Auto-sklearn's 93.4%). On ChestMNIST, it reached 94.7% accuracy, matching ResNet-18/50 and trailing only Google AutoML Vision.
- Optimal performance was observed with Gaussian kernel $\sigma = 0.5$ , which is consistent with the need for a balanced intersection of sublevel sets.
3D Performance:
- The method outperformed single-parameter approaches and showed strong competitiveness against 3D ResNet variants.
- Notable results: VesselMNIST3D (93.3% AUC, 93.7% ACC) and AdrenalMNIST3D (87.0% AUC, 84.7% ACC).
- On SynapseMNIST3D, the method achieved 82.7% accuracy, surpassing several deep learning baselines.

5. Significance and Future Work

Significance: This work bridges the gap between theoretical multi-parameter TDA and practical medical image analysis. Its experiments indicate that carefully designed topological features can replace or augment raw pixel data, enabling competitive classification with lightweight models (MLP), which may be more interpretable and need less data than deep neural networks.
Future Directions:
- Extending the framework to three-parameter filtrations to capture even more intricate topological structures.
- Integrating G-LoG into end-to-end deep learning optimization pipelines to allow for joint learning of filtration parameters and classification weights.
- Applying the method to broader domains like computer graphics and other scientific imaging fields.

In summary, the paper presents a robust, theoretically grounded, and empirically successful approach to medical image classification using multi-parameter persistent homology, demonstrating that topological features can rival state-of-the-art deep learning performance.