Imagine you have a very smart, but somewhat mysterious, robot chef. This robot is amazing at looking at pictures of food and telling you exactly what dish it is (like "That's a pizza!" or "That's a taco!"). However, nobody really knows how it decides. It just gives an answer, and we have to trust it.
This paper is like a guidebook for opening up the robot's kitchen to see exactly how it thinks. The authors, Rebecca and Marios Pattichis, use a set of mathematical tools called Linear Algebra (think of it as the "geometry of data") to turn the robot's brain into something we can actually see and understand.
Here is a simple breakdown of their ideas using everyday analogies:
1. The Four "Rooms" in the Robot's Brain
The authors say that every layer of the neural network (a step in the robot's thinking process) can be understood by looking at four specific "rooms" or spaces where the image data lives.
- The Signal Room (What the robot cares about): Imagine the robot is looking at a picture of a cat. The "Signal Room" contains all the parts of that picture the robot actually pays attention to—like the ears, the whiskers, and the tail. These are the features that help it say "Cat."
- The Signal Output Room (The result): This is where the robot sends the "cat" information after it has processed it. It's the final message the robot sends to the next step in the chain.
- The Rejected Signal Room (The trash can): This is the most interesting part. It contains everything the robot ignores. If you showed the robot a picture of a cat with a weird, invisible background pattern, that pattern would end up in the "Rejected Signal Room." The robot looks at it, decides it doesn't matter, and throws it away. The paper shows us exactly what gets thrown away at every step.
- The Rejected Output Room: This is the part of the answer space the robot can never reach, no matter what image you show it. These are output patterns the robot couldn't produce even if it tried, like the "dead ends" in the robot's logic.
The Big Idea: By looking at these rooms, we can see exactly what information is being kept and what is being deleted as the image moves through the network.
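For readers who want to peek behind the analogy: in linear algebra, the four "rooms" of a layer's weight matrix correspond to its four fundamental subspaces, all of which fall out of one singular value decomposition. The sketch below uses a made-up toy matrix (the numbers are not from the paper) to show how each room can be extracted with NumPy.

```python
import numpy as np

# A tiny "layer": a 3x5 weight matrix W mapping 5-pixel inputs to 3 outputs.
# (Toy values only -- not the paper's networks.)
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
W[2] = W[0] + W[1]            # make W rank-deficient so every "room" is non-empty

# One SVD exposes all four subspaces: W = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(W)
rank = int(np.sum(s > 1e-10))

row_space  = Vt[:rank].T      # "Signal Room": input directions W responds to
null_space = Vt[rank:].T      # "Rejected Signal Room": inputs W maps to zero
col_space  = U[:, :rank]      # "Signal Output Room": outputs W can produce
left_null  = U[:, rank:]      # "Rejected Output Room": outputs W can never produce

# Anything in the Rejected Signal Room really is thrown away:
trash = null_space[:, 0]
print(np.allclose(W @ trash, 0))   # True
```

The point of the decomposition is that "what gets kept" and "what gets deleted" are not mysterious: they are fixed geometric properties of the weights, readable off the matrix itself.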
2. The "Sieve" Analogy
Think of a neural network layer like a colander (a kitchen strainer) with specific holes.
- The Signal: The water (the important parts of the image) flows through the holes.
- The Residual (Trash): The pasta (the unimportant parts) gets stuck in the colander.
The authors show us how to look at the "holes" in the colander (the weights) to see what shape of pasta gets stuck. For example, in a simple network, they found that the "holes" were shaped to catch bright and dark spots, effectively turning the image into a high-contrast black-and-white sketch. In a more complex network (ResNet), the holes were shaped to catch specific lines and edges, like vertical lines or diagonals.
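The colander picture can be written as one line of algebra: every input splits into a part the layer can see (its projection onto the row space of the weights) and a residual the layer is blind to. This is a minimal sketch with invented toy values, not the paper's actual networks:

```python
import numpy as np

# The "sieve": split an input into the part a layer keeps
# and the part it throws away. W and x are made-up toy values.
rng = np.random.default_rng(1)
W = rng.standard_normal((3, 8))   # layer weights: 8 "pixels" -> 3 features
x = rng.standard_normal(8)        # an input "image"

P = np.linalg.pinv(W) @ W         # orthogonal projector onto W's row space
signal   = P @ x                  # the water: flows through the holes
residual = x - signal             # the pasta: stuck in the colander

# The layer literally cannot see the residual:
print(np.allclose(W @ residual, 0))     # True
print(np.allclose(W @ x, W @ signal))   # True
```

Inspecting the rows of `W` (the "holes") is what lets the authors say which image patterns, bright spots or diagonal edges, pass through and which are caught.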
3. The "Reverse Engineering" Magic
One of the coolest parts of the paper is about Invertible Networks.
Usually, if you show a robot a picture of a "9", it can tell you it's a 9. But if you ask, "Show me a picture that makes you think '9'!", it's usually hard to get a clear answer.
The authors used their math to run the robot backward.
- The Analogy: Imagine you have a smoothie. Usually, you can't turn a smoothie back into a strawberry and a banana. But these authors found a way to "un-blend" the smoothie.
- How they did it: They started with the robot's "perfect answer" (e.g., a 100% confident "This is a 9") and worked backward through the layers. Because they used special mathematical rules, they could reconstruct the exact image that would make the robot say "9."
- The Result: They generated images that looked like the "ideal" version of a number. For simple networks, these images looked like clear, high-contrast drawings. For complex networks, the images were a bit blurry or looked like binary code (just black and white dots), showing that the complex robot had a very specific, rigid way of seeing the world.
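To make "un-blending the smoothie" concrete, here is a toy sketch of running a network backward. This is not the paper's exact construction; it is a hypothetical two-layer net built so that every step is reversible (square invertible weights and a leaky-ReLU activation, which, unlike plain ReLU, never destroys information):

```python
import numpy as np

# Toy invertible net: both layers can be undone exactly.
rng = np.random.default_rng(2)
W1 = rng.standard_normal((4, 4))  # square => invertible (almost surely)
W2 = rng.standard_normal((4, 4))
alpha = 0.1                       # leaky-ReLU slope; alpha > 0 keeps it invertible

def act(z):      return np.where(z > 0, z, alpha * z)
def act_inv(z):  return np.where(z > 0, z, z / alpha)

def forward(x):  return W2 @ act(W1 @ x)
def backward(y): return np.linalg.solve(W1, act_inv(np.linalg.solve(W2, y)))

# Start from a "perfect answer" y and reconstruct the unique input behind it.
y = np.array([1.0, 0.0, 0.0, 0.0])
x = backward(y)
print(np.allclose(forward(x), y))   # True
```

The key design choice is that every operation has an exact inverse, so "what input would make you say this?" has one precise answer instead of a guess.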
4. Why Does This Matter?
In the past, we treated neural networks like "Black Boxes." We put a picture in, got an answer out, and had no idea what happened in between.
This paper gives us an X-ray machine for AI.
- It helps us see if the robot is learning the right things (like the shape of a digit) or the wrong things (like the background color).
- It helps us see what information is being lost.
- It allows us to "reverse engineer" the robot to see what it considers the "perfect" example of anything.
In a nutshell: The authors took the scary math behind AI and turned it into a visual map. They showed us that AI doesn't just "guess"; it systematically filters out the noise and keeps the signal, and now we have a way to watch that filtering process happen in real-time.