SFIBA: Spatial-based Full-target Invisible Backdoor Attacks

Imagine you have a very smart robot chef (a Deep Neural Network) that can perfectly identify ingredients like "tomato," "carrot," or "onion." Now, imagine a hacker wants to trick this robot.

In the past, hackers could only teach the robot one trick: "If you see a tiny red dot, call it a 'carrot'." This is a Single-Target Backdoor. It's useful, but limited. If the hacker later wants the red dot to mean "onion" instead, they have to retrain the whole robot, which is slow and obvious.

This paper introduces a new, super-sneaky trick called SFIBA (Spatial-based Full-target Invisible Backdoor Attack). Think of it as teaching the robot every possible trick at once, without it ever noticing.

Here is how SFIBA works, broken down into simple concepts:

1. The Goal: The "Master Switch"

Instead of teaching the robot just one trick, the hacker wants to create a Master Switch.

Old Way: You can only make the robot think a "dog" is a "cat."
SFIBA Way: You can make the robot think a "dog" is a "cat," a "car," or a "banana," depending on which secret signal you use. And you can do this for every single category the robot knows, all at the same time.

2. The Problem: The "Crowded Room"

The biggest problem with doing this is interference.
Imagine trying to whisper a secret to 100 different people in a crowded room. If you shout all the secrets at once, no one hears anything clearly.

In AI terms, if you try to inject too many "triggers" (secrets) into the training data, they start fighting each other. The robot gets confused, the tricks stop working, or the changes become visible (like a giant red dot on a picture), alerting the defenders.
Also, the hacker doesn't get to see the robot's brain (it's a "Black Box"). They can only change the food (training data) they give the robot, not how the robot thinks.

3. The Solution: "Zoning" and "Invisible Ink"

SFIBA solves this with two main ideas: Spatial Zoning and Frequency Domain Magic.

A. Spatial Zoning (The "Post-it Note" Strategy)

Instead of shouting across the whole room, the hacker assigns a specific, tiny zone for each secret.

Imagine the image is a large wall.
For the "Dog-to-Cat" trick, the secret is hidden in the top-left corner.
For the "Dog-to-Car" trick, the secret is hidden in the bottom-right corner.
For the "Dog-to-Banana" trick, it's in the middle.
Why it works: Because the secrets are in different, non-overlapping corners, they don't bump into each other. The robot learns to look at the top-left corner for one trick and the bottom-right for another. This allows the hacker to control every class without the tricks canceling each other out.

B. Frequency Domain Magic (The "Invisible Ink" Strategy)

Now, how do you hide the secret in that corner so that a human (or a security tool) can't see it, while the robot still picks it up?

The Problem: If you just paint a dot on the wall, everyone sees it.
The SFIBA Trick: Instead of painting on the "surface" (pixels), the hacker changes the vibrations of the wall.
- Think of an image like a song. It has a melody (what you see) and a rhythm (the hidden frequencies).
- SFIBA uses a mathematical tool called FFT (Fast Fourier Transform) to turn the image into a song.
- It then uses Wavelets (like a super-precise microscope) to find the specific "notes" in the song that correspond to the secret.
- It tweaks these notes slightly. To your eyes, the song sounds exactly the same. The image looks identical. But to the robot, the "vibration" of that specific corner has changed, triggering the secret command.
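To make the "song" idea a bit more concrete, here is a minimal NumPy sketch. In FFT terms, a small image patch splits into an amplitude part and a phase part; nudging a couple of amplitude values while keeping the phase produces a patch that is numerically different but almost identical pixel by pixel. The 8x8 random patch, the 5% tweak, and the helper names are illustrative assumptions, not values or code from the paper, and the wavelet step is left out here.

```python
import numpy as np

def split_amplitude_phase(block):
    """Turn a 2D patch into its FFT amplitude and phase spectra."""
    spectrum = np.fft.fft2(block)
    return np.abs(spectrum), np.angle(spectrum)

def merge_amplitude_phase(amplitude, phase):
    """Rebuild a real-valued patch from amplitude and phase spectra."""
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))

rng = np.random.default_rng(0)
block = rng.random((8, 8))            # stand-in for one small image region

amplitude, phase = split_amplitude_phase(block)
restored = merge_amplitude_phase(amplitude, phase)   # round trip, no change

# Tweak one "note" by 5% (an arbitrary illustrative amount). The mirrored
# frequency gets the same tweak so the rebuilt patch stays real-valued.
tweaked_amplitude = amplitude.copy()
tweaked_amplitude[3, 3] *= 1.05
tweaked_amplitude[-3, -3] *= 1.05
tweaked = merge_amplitude_phase(tweaked_amplitude, phase)

max_change = np.max(np.abs(tweaked - block))
print(f"largest pixel change: {max_change:.5f}")
```

The largest pixel change comes out far smaller than anything an eye would notice on a 0-to-1 brightness scale, which is the intuition behind hiding a trigger in the frequencies rather than painting it onto the pixels.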

4. The "Shape-Shifter" (Morphology Constraints)

To make sure the robot doesn't get confused if the image is rotated or flipped (like a picture of a dog turned sideways), SFIBA gives each secret a unique shape.

The "Dog-to-Cat" secret in the top-left corner is shaped like a horizontal line.
The "Dog-to-Car" secret in the bottom-right is shaped like a vertical line.
Even if the image moves, the robot knows: "Ah, I see a horizontal line in the top-left, so I must call this a cat." This keeps the tricks distinct and robust.

5. The "Dynamic Tuner" (The Volume Knob)

Finally, the system has a smart volume knob.

If the secret is too loud, the robot might notice the image looks weird.
If it's too quiet, the robot won't hear the command.
SFIBA automatically adjusts the "volume" (injection coefficient) for every single image to ensure it's just loud enough to work, but quiet enough to remain invisible. It checks the "quality score" (PSNR) and fine-tunes until it's perfect.
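The volume knob can be sketched as a simple halving search: try a coefficient, measure PSNR, and move up or down until the score lands in a target window. This is only a sketch under loud assumptions: the `inject` function is a hypothetical additive stand-in for the real injection, and the 39 to 41 dB window, the upper bound `k_max`, and the random test data are made up for illustration.

```python
import numpy as np

def psnr(clean, poisoned, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((clean - poisoned) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def inject(clean, trigger, k):
    """Hypothetical stand-in for the real injection: plain additive blending."""
    return np.clip(clean + k * trigger, 0.0, 1.0)

def tune_coefficient(clean, trigger, psnr_low, psnr_high, k_max=1.0, max_steps=50):
    """Halve the search interval until PSNR falls inside [psnr_low, psnr_high]."""
    lo, hi = 0.0, k_max
    k, score = 0.0, float("inf")
    for _ in range(max_steps):
        k = (lo + hi) / 2.0
        score = psnr(clean, inject(clean, trigger, k))
        if score > psnr_high:      # too quiet: the trigger can be stronger
            lo = k
        elif score < psnr_low:     # too loud: the change is too visible
            hi = k
        else:
            break                  # inside the window, stop searching
    return k, score

rng = np.random.default_rng(0)
clean = rng.random((32, 32)) * 0.6 + 0.2          # toy image, values in [0.2, 0.8]
trigger = rng.standard_normal((32, 32)) * 0.1     # toy trigger pattern

best_k, best_psnr = tune_coefficient(clean, trigger, psnr_low=39.0, psnr_high=41.0)
print(f"k = {best_k:.4f}, PSNR = {best_psnr:.2f} dB")
```

The search relies on PSNR falling as the coefficient grows, which holds for this additive stand-in; each image gets its own coefficient because the same strength can be more or less visible depending on the picture.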

The Result: The Perfect Heist

The paper shows that SFIBA is incredibly effective:

Full Control: It can hijack every class in the robot's brain, not just one.
Invisible: Humans and standard security tools cannot see the difference between a clean image and a poisoned one.
Stealthy: It bypasses current security defenses that try to find backdoors.
Black-Box Friendly: The hacker doesn't need to know how the robot works; they just need to feed it the right "poisoned" food.

In summary: SFIBA is like a master spy who can whisper a different secret to a guard at every single door in a building, using invisible ink and specific hand signals, without ever getting caught or causing a panic. It turns a single-target trick into a full-building takeover.

Here is a detailed technical summary of the paper "SFIBA: Spatial-based Full-target Invisible Backdoor Attacks".

1. Problem Statement

The paper addresses critical limitations in existing Multi-Target Backdoor Attacks on Deep Neural Networks (DNNs), specifically within black-box settings (where the attacker only manipulates the training data without knowledge of the model architecture or parameters).

The Gap: While multi-target attacks let an attacker steer inputs toward several different target classes, each selected by its own trigger, existing methods fail to simultaneously achieve Full-Target capability (attacking all classes) and Stealthiness (visual imperceptibility).
Specific Challenges:
1. Trigger Specificity: In black-box settings, injecting triggers for all classes often causes interference between different backdoors, degrading performance. Existing methods cannot reliably establish unique mappings for every class without model access.
2. Stealthiness: Triggers designed for multiple targets often lack visual imperceptibility, making poisoned samples easy to detect.
3. Black-Box Constraints: Most powerful multi-target attacks (e.g., Marksman) require white-box access or control over the training process, which is unrealistic in many real-world scenarios.

2. Methodology: SFIBA

The authors propose SFIBA (Spatial-based Full-target Invisible Backdoor Attack), a framework that leverages the sensitivity of backdoors to trigger spatial locations and morphologies to achieve full-target attacks in black-box settings.

The methodology consists of three core stages:

A. Theoretical Foundation: Spatial Sensitivity

The authors prove (via Lemma 1 and Neural Tangent Kernel theory) that backdoored models are highly sensitive to the spatial location of triggers. If a trigger is shifted at inference time to a region that does not overlap the one used during training, the backdoor effect vanishes.

Strategy: Divide the image into disjoint local spatial regions (Blocks). Assign a unique Block (and specific RGB channel) to each target class. This ensures that triggers for different classes do not interfere with one another.
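A minimal sketch of this zoning strategy follows. The summary does not give the paper's class-to-Block algorithm, so the mapping used here (channel = class mod 3, Block index = class div 3, row-major grid) is a hypothetical choice, as are the 32x32 image, 8x8 Block, and 2-pixel gap. The only property it aims to illustrate is that every target class gets its own (Block, channel) pair.

```python
from typing import NamedTuple

class Zone(NamedTuple):
    channel: int   # 0 = Red, 1 = Green, 2 = Blue
    top: int       # row of the Block's top-left pixel
    left: int      # column of the Block's top-left pixel
    size: int      # Block side length in pixels

def assign_zone(target_class, image_size=32, block_size=8, gap=2):
    """Map a target class to a unique (Block, channel) pair.

    Hypothetical mapping for illustration; the paper's own algorithm may differ.
    """
    stride = block_size + gap
    per_row = (image_size - gap) // stride        # Blocks that fit along one side
    block_index, channel = divmod(target_class, 3)
    if target_class < 0 or block_index >= per_row * per_row:
        raise ValueError("not enough disjoint Blocks for this class index")
    row, col = divmod(block_index, per_row)
    return Zone(channel, gap + row * stride, gap + col * stride, block_size)

# Ten classes (e.g. CIFAR-10) each receive a distinct zone.
zones = [assign_zone(t) for t in range(10)]
for t, zone in enumerate(zones):
    print(t, zone)
```

With these assumed sizes the grid holds 3 x 3 Blocks, so three channels give 27 distinct zones; datasets with more classes would need smaller Blocks or larger images.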

B. Local Space Dynamic Invisible Trigger Injection

SFIBA employs a three-step process to inject triggers into specific Blocks:

Block Selection:
- The image is divided into a grid of Blocks.
- To prevent overlap during data augmentation (rotation, translation), an interval is added around each Block.
- A class-specific algorithm maps each target class $t$ to a unique Block $i$ and a specific channel $n$ (Red, Green, or Blue).
Frequency-Domain Poisoning (The Core Injection):
To ensure stealthiness within the small, constrained Block, SFIBA moves from pixel space to the frequency domain:
- FFT (Fast Fourier Transform): Converts the clean Block and the trigger into amplitude and phase spectra. The phase spectrum (high-level semantics) is preserved, while the amplitude spectrum (low-level semantics) is modified.
- DWT (Discrete Wavelet Transform): Applied to the amplitude spectrum to extract features. Specifically, diagonal features ( $HH$ ) are targeted because they contain less energy, minimizing visual distortion.
- SVD (Singular Value Decomposition): Instead of directly overlaying the trigger, the method fuses the trigger's features into the singular values of the clean amplitude spectrum. This reduces the sensitivity of the trigger strength to the injection coefficient, making the attack more robust and adjustable.
- Inverse Transformation: The modified amplitude spectrum is combined with the original phase spectrum and converted back to the pixel domain via Inverse FFT.
Morphological Constraints & Dynamic Optimization:
- Morphology: After injection, DWT is applied again to the poisoned Block. The trigger is restricted to specific morphological patterns (e.g., horizontal vs. vertical distribution) based on the Block's location. This further differentiates triggers for adjacent classes.
- Dynamic Tuning: An algorithm dynamically adjusts the injection coefficient ( $K$ ) based on PSNR (Peak Signal-to-Noise Ratio). It uses a bisection (binary) search to find a $K$ that keeps PSNR within a specified range, so the poisoned Block stays visually imperceptible while the trigger remains effective.
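The frequency-domain steps above (FFT, DWT on the amplitude spectrum, SVD fusion in the $HH$ sub-band, inverse transforms) can be sketched for a single Block and channel as follows. This is an illustrative sketch, not the authors' implementation: the one-level Haar wavelet is written by hand to keep the example dependency-free, the fusion rule `s + k * s_trig` is an assumed way of mixing singular values, and the random 8x8 arrays stand in for a real Block and trigger. The morphological constraint and the dynamic tuning of $K$ are omitted.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One-level 2D Haar transform; returns the (LL, LH, HL, HH) sub-bands."""
    lo = (x[0::2, :] + x[1::2, :]) / SQRT2     # low-pass along rows
    hi = (x[0::2, :] - x[1::2, :]) / SQRT2     # high-pass along rows
    ll = (lo[:, 0::2] + lo[:, 1::2]) / SQRT2
    lh = (lo[:, 0::2] - lo[:, 1::2]) / SQRT2
    hl = (hi[:, 0::2] + hi[:, 1::2]) / SQRT2
    hh = (hi[:, 0::2] - hi[:, 1::2]) / SQRT2   # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    rows, cols = ll.shape
    lo = np.empty((rows, 2 * cols))
    hi = np.empty((rows, 2 * cols))
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((2 * rows, 2 * cols))
    x[0::2, :], x[1::2, :] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x

def poison_block(clean, trigger, k):
    """Fuse trigger features into the HH singular values of the clean amplitude."""
    spectrum = np.fft.fft2(clean)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)   # phase is kept as-is
    trigger_amplitude = np.abs(np.fft.fft2(trigger))

    ll, lh, hl, hh = haar_dwt2(amplitude)
    trigger_hh = haar_dwt2(trigger_amplitude)[3]

    u, s, vt = np.linalg.svd(hh, full_matrices=False)
    s_trig = np.linalg.svd(trigger_hh, compute_uv=False)
    fused_hh = u @ np.diag(s + k * s_trig) @ vt               # assumed fusion rule

    new_amplitude = haar_idwt2(ll, lh, hl, fused_hh)
    # Taking the real part discards any small imaginary residue left after
    # the amplitude edit breaks the spectrum's exact conjugate symmetry.
    return np.real(np.fft.ifft2(new_amplitude * np.exp(1j * phase)))

rng = np.random.default_rng(0)
clean = rng.random((8, 8))       # stand-in for one Block of one channel
trigger = rng.random((8, 8))     # stand-in trigger pattern of the same size

unchanged = poison_block(clean, trigger, k=0.0)
poisoned = poison_block(clean, trigger, k=0.05)
print("max pixel change:", np.max(np.abs(poisoned - clean)))
```

Setting `k=0.0` returns the clean Block, which is a convenient sanity check on the transform chain. In practice a wavelet library such as PyWavelets would likely replace the hand-written Haar functions.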

3. Key Contributions

First Full-Target Black-Box Attack: SFIBA is the first approach to successfully attack all classes in a black-box setting while maintaining trigger stealthiness.
Spatial-Morphological Specificity: Theoretical proof and experimental validation that restricting triggers to disjoint spatial regions and applying unique morphological constraints prevents interference between multiple backdoors.
Frequency-Domain Injection: A novel pipeline combining FFT, DWT, and SVD to inject triggers into small local regions without degrading visual quality or attack success rates.
Robustness: The method is designed to withstand data augmentation and advanced defense mechanisms.

4. Experimental Results

The authors evaluated SFIBA on CIFAR-10, GTSRB, and ImageNet-100 using models like PreActResNet18 and VGG19.

Attack Success Rate (ASR): SFIBA achieved near-perfect ASR (often >99%) across all target classes and datasets, significantly outperforming baselines like One-to-N, Marksman, and Universal Backdoor Attacks (UBA).
Stealthiness:
- Visual Metrics: Achieved high PSNR (>40), SSIM (>0.99), and low LPIPS, indicating the poisoned images are visually indistinguishable from clean ones.
- Comparison: Outperformed baselines in visual quality; baselines often showed visible artifacts or lower ASR.
Benign Accuracy (BA): The model's performance on clean samples remained high, with negligible decrease (DV) compared to the clean model.
Defense Evasion: SFIBA successfully bypassed state-of-the-art defenses, including:
- Fine-Pruning: Pruning neurons did not significantly reduce ASR.
- Neural Cleanse: Anomaly metrics remained below the detection threshold.
- STRIP & EBBA: Entropy and energy score distributions were indistinguishable from benign samples.
Ablation Studies: Removing any component (Dynamic Optimization, Morphology Constraints, SVD, or DWT) resulted in significant drops in either ASR or visual stealthiness, proving the necessity of the full pipeline.
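For readers unfamiliar with the two headline metrics, the sketch below computes them from predicted labels using their common definitions in the backdoor literature. The exact evaluation protocol of the paper is not given in this summary, so details such as skipping samples that already belong to the target class are assumptions, and the label arrays are toy values rather than experimental data.

```python
import numpy as np

def benign_accuracy(pred_clean, true_labels):
    """BA: fraction of clean inputs the model classifies correctly."""
    return float(np.mean(np.asarray(pred_clean) == np.asarray(true_labels)))

def attack_success_rate(pred_poisoned, target_labels, true_labels):
    """ASR: fraction of triggered inputs classified as the attacker's target.

    Samples whose true label already equals the target are skipped
    (a common convention, assumed here).
    """
    pred = np.asarray(pred_poisoned)
    target = np.asarray(target_labels)
    true = np.asarray(true_labels)
    mask = true != target
    return float(np.mean(pred[mask] == target[mask]))

# Toy labels for four test images (illustrative only).
true_labels = [0, 1, 2, 3]
pred_clean = [0, 1, 2, 0]        # model output on clean images
target_labels = [1, 1, 1, 1]     # class the attacker aims for
pred_poisoned = [1, 1, 1, 2]     # model output on triggered images

ba = benign_accuracy(pred_clean, true_labels)
asr = attack_success_rate(pred_poisoned, target_labels, true_labels)
print(f"BA = {ba:.2f}, ASR = {asr:.2f}")
```

In a full-target setting, ASR would be computed once per target class (each with its own trigger) and then reported per class or averaged.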

5. Significance

Security Threat: SFIBA demonstrates a severe vulnerability in deep learning supply chains. Attackers can now compromise a model to misclassify inputs into any desired class (e.g., bypassing facial recognition for any employee) without retraining or model access.
Theoretical Insight: The paper provides a theoretical understanding of how spatial sensitivity and frequency-domain manipulation can be leveraged to decouple multiple backdoor triggers, solving the "interference" problem in multi-target attacks.
Defense Implications: The success of SFIBA against existing defenses suggests that current detection methods (which often rely on spatial consistency or frequency analysis) are insufficient against sophisticated, spatially-constrained, frequency-domain attacks. This necessitates the development of new defense paradigms.