Imagine you are trying to understand how a human (or a computer) recognizes a cup. You show them a picture of a red cup, and they say, "That's a cup!" Great. But what if you show them a blue cup? Or a cup made of glass? Or a cup lying on its side? Do they still recognize it?
This ability to recognize the same thing even when it looks slightly different is called invariance. It's the secret sauce that allows us to navigate the world without getting confused every time the lighting changes or an object moves.
For a long time, scientists trying to understand this in artificial brains (like AI) had a blind spot. They could find the "perfect" image that made a specific AI neuron fire like crazy (like a red cup). But they couldn't easily map out the entire range of images that the neuron would accept. They didn't know the boundaries of the neuron's "tolerance."
Enter SnS (Stretch-and-Squeeze). Think of it as a new, super-smart detective tool that doesn't need to peek inside the AI's brain to figure out how it thinks.
The Core Idea: Stretching and Squeezing
Imagine you have a rubber band.
- Stretching: You pull the rubber band as far as you can without breaking it.
- Squeezing: You compress it as much as you can without it snapping back.
SnS uses these two actions to test an AI:
The "Stretch" (Finding Invariance):
- Goal: Find an image that looks completely different from the original cup (maybe it's now a green, metallic, upside-down cup), but the AI still thinks, "Yes, that is definitely a cup!"
- The Metaphor: Imagine you are stretching a piece of clay. You want to pull it into a weird, alien shape, but you must keep the "cup-ness" inside it intact. SnS does this by mathematically "stretching" the image's features until they are as far away as possible from the original, while "squeezing" the AI's reaction to stay exactly the same.
- Result: This reveals the true limits of the AI's understanding. It finds weird, creative variations of a cup that the AI accepts, which standard tests (like just rotating the image) would miss.
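In spirit, the "stretch" is an optimization problem: push the input as far as possible from the original while pinning the neuron's response in place. Here is a minimal, hypothetical sketch of that idea — a toy one-unit "neuron" and a naive random hill-climb stand in for the real network and the paper's actual optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one AI "neuron": a fixed linear filter + ReLU.
w = rng.normal(size=64)
def neuron(x):
    return max(0.0, float(w @ x))

x0 = w.copy()        # the original "cup" image (flattened), chosen so a0 > 0
a0 = neuron(x0)      # the response we must preserve

def stretch_score(x, tol=0.05):
    """Reward distance from x0, but only while the response stays put."""
    if abs(neuron(x) - a0) > tol * abs(a0):
        return -np.inf                  # response drifted: reject outright
    return np.linalg.norm(x - x0)       # otherwise, farther is better

# Gradient-free hill climb: propose random nudges, keep improvements.
x, best = x0.copy(), 0.0
for _ in range(5000):
    cand = x + 0.1 * rng.normal(size=64)
    s = stretch_score(cand)
    if s > best:
        x, best = cand, s
# x is now far from x0 in pixel space, yet the neuron's answer is unchanged.
```

The key property to notice: the loop only ever *evaluates* `neuron`, never inspects it.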
The "Squeeze" (Finding Weaknesses/Adversarial Attacks):
- Goal: Find an image that looks almost exactly like the original cup, but tricks the AI into thinking it's a toaster.
- The Metaphor: You take a perfect cup and make tiny, invisible tweaks to the pixels (squeezing the distance between the images toward zero) until the AI's brain snaps and says, "Wait, that's a toaster!"
- Result: This shows where the AI is fragile and easily fooled.
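The "squeeze" flips the objective: keep the AI's answer *changed* while shrinking the perturbation as small as possible. Another hedged sketch — a toy linear two-class classifier and a crude shrink-while-it-still-flips loop stand in for the real model and the paper's search method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-class "classifier": linear logits for "cup" (0) vs "toaster" (1).
# As before, we only ever query its predictions -- black box style.
W = rng.normal(size=(2, 64))
def predict(x):
    return int(np.argmax(W @ x))

x0 = W[0] - W[1]            # an input the model confidently calls "cup"
assert predict(x0) == 0

# Step 1: find ANY perturbation that flips the label (random restarts).
delta = None
while delta is None:
    trial = 50.0 * rng.normal(size=64)
    if predict(x0 + trial) == 1:
        delta = trial
start_norm = np.linalg.norm(delta)

# Step 2: "squeeze" -- shrink the perturbation while the flip survives.
for _ in range(2000):
    cand = 0.9 * delta + 0.05 * rng.normal(size=64)
    if predict(x0 + cand) == 1 and np.linalg.norm(cand) < np.linalg.norm(delta):
        delta = cand
# x0 + delta still reads as "toaster", but delta is now much smaller.
```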
Why is this special?
Most previous tools needed to peek inside the AI and read off its internal wiring (the gradients) to know which way to move. If the AI is a "black box" — you can see its answers but not its insides — those tools fail.
SnS is different. It's gradient-free.
- The Analogy: Imagine you are in a dark room trying to find the exit.
- Old methods: You need a map and a flashlight (gradients) to see the path. If the map is missing, you're stuck.
- SnS: You just start walking in different directions, testing if you hit a wall or find the door. You don't need to see the whole room; you just need to know if your feet are moving you closer to the goal. This makes SnS work on any system, even biological brains (like a monkey's visual cortex) where we can't see the "code."
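That "walking in the dark" loop can be written down in a few lines: propose a random step, keep it if the score improves, otherwise stay put. Everything below is illustrative — the target and score function are made-up stand-ins, not the paper's actual algorithm — but it shows why no map (gradient) is ever needed:

```python
import numpy as np

def black_box_search(score, x, steps=3000, sigma=0.1, seed=0):
    """Maximize `score` using only evaluations: nudge randomly,
    keep the nudge if the score improves, otherwise stay put."""
    rng = np.random.default_rng(seed)
    best = score(x)
    for _ in range(steps):
        cand = x + sigma * rng.normal(size=x.shape)
        s = score(cand)
        if s > best:
            x, best = cand, s
    return x, best

# The "score" can be anything we can measure from outside -- a network's
# output, or in principle a recorded firing rate from a real neuron.
target = np.linspace(-1.0, 1.0, 16)            # hypothetical goal pattern
score = lambda x: -np.linalg.norm(x - target)  # closer to target = better
x, best = black_box_search(score, np.zeros(16))
```

Because the loop only asks "did that step help?", the same search runs unchanged on a deep network or a biological brain.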
What Did They Discover?
The researchers used SnS on different layers of an AI brain (ResNet50) and found some fascinating things:
Different Layers, Different Rules:
- If you stretch the bottom layer (pixels), the AI accepts cups that just have different colors or brightness.
- If you stretch the middle layer, the AI accepts cups with different textures (like a fuzzy cup vs. a shiny cup).
- If you stretch the top layer, the AI accepts cups that are in totally different poses or even have other objects mixed in.
- Takeaway: The AI builds its understanding of "cup-ness" step-by-step, and SnS maps out exactly what each step cares about.
The "Robust" AI Paradox:
- Scientists have trained "Robust" AIs to be harder to fool. Their internal representations usually end up looking more human-like.
- The Twist: When SnS tested these Robust AIs, it found that while they were great at recognizing simple changes (like color), they actually became worse at understanding complex, high-level changes (like a cup being held upside down) compared to normal AIs.
- The Metaphor: It's like a student who memorized the dictionary perfectly (Robust AI) but struggles to understand a joke (high-level invariance), whereas a normal student (Standard AI) might get the joke but miss the spelling. SnS revealed that making AI "robust" didn't make it "human-like" in the way we hoped.
Why Should You Care?
- For AI Safety: It helps us find the weird, hidden ways AI can be tricked, making our self-driving cars and medical scanners safer.
- For Neuroscience: It allows scientists to study how real animal brains work without needing a perfect computer model of the brain first. It's like being able to interview a witness without needing to know their entire life story.
- For Understanding Intelligence: It shows us that "recognizing a cup" isn't just one thing; it's a complex, layered dance of features. SnS helps us see the whole dance, not just the first step.
In short, SnS is a new magnifying glass that lets us see the invisible boundaries of how machines (and brains) see the world, revealing that the "rules" for recognition are much more complex and interesting than we thought.