Adversarial Robustness of Deep Learning-Based Thyroid Nodule Segmentation in Ultrasound

This study evaluates the vulnerability of deep learning-based thyroid nodule segmentation in ultrasound to two distinct adversarial attacks, revealing that while spatial-domain perturbations can be partially mitigated by inference-time defenses, frequency-domain attacks remain largely unaddressed, underscoring modality-specific challenges in robustness.

Nicholas Dietrich, David McShannon

Published 2026-02-26

Imagine you have a very smart, automated robot doctor that looks at ultrasound pictures of a thyroid gland. Its job is to draw a perfect line around any lumps (nodules) it sees, helping real doctors decide if a patient needs treatment. This robot is powered by "Deep Learning," a type of artificial intelligence that gets really good at recognizing patterns after studying thousands of pictures.

However, this paper asks a scary question: What if someone tricks the robot?

Just like you can trick a human by wearing a disguise, a hacker can add tiny, invisible "glitches" to an ultrasound image. To the human eye, the picture looks exactly the same. But to the robot, these glitches are like a secret code that makes it draw the wrong line, miss the lump entirely, or draw a lump where there isn't one. This is called an adversarial attack.
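The core idea is that a change too small for a human to notice can still flip a model's per-pixel decisions. Here is a minimal sketch of that principle, using a toy threshold-based "segmenter" in place of the paper's deep network (real attacks use model gradients to pick the nudge direction; in this toy we know the direction directly):

```python
import numpy as np

# Toy "segmenter": labels a pixel as nodule if its intensity exceeds a threshold.
# This is NOT the paper's deep network -- just a stand-in to show the principle.
def toy_segment(image, threshold=0.5):
    return image > threshold

rng = np.random.default_rng(0)
image = rng.uniform(0.45, 0.55, size=(8, 8))  # pixels hovering near the boundary

clean_mask = toy_segment(image)

# The "attack" nudges each pixel by at most epsilon toward the wrong side
# of the decision boundary. epsilon is tiny relative to the image range.
epsilon = 0.06
perturbation = np.where(clean_mask, -epsilon, epsilon)
adversarial = np.clip(image + perturbation, 0.0, 1.0)

attacked_mask = toy_segment(adversarial)

# The perturbation is visually negligible...
print("max pixel change:", np.abs(adversarial - image).max())
# ...yet it flips the model's per-pixel decisions.
print("pixels flipped:", int((clean_mask != attacked_mask).sum()))
```

Deep networks are far harder to flip than a bare threshold, but the principle carries over: a carefully aimed, barely visible nudge can change the output drastically.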

The researchers wanted to see if they could trick this thyroid robot, and more importantly, if they could build a "shield" to stop the tricks.

The Two Ways They Tricked the Robot

The team invented two different types of "disguises" to fool the AI, using the unique nature of ultrasound images (which look like grainy static rather than clear photos):

  1. The "Static Noise" Trick (SSAA):

    • The Analogy: Imagine the ultrasound image is a grainy black-and-white photo. This attack adds a tiny bit of extra "static" (like TV snow) right along the edge of the lump.
    • The Effect: It's like whispering a lie right next to the robot's ear. The robot gets confused about exactly where the edge of the lump is. The image still looks normal to a human, but the robot's drawing becomes messy and inaccurate.
    • Result: This trick was very effective. It significantly reduced the robot's accuracy.
  2. The "Frequency Hacking" Trick (FDUA):

    • The Analogy: Every image is made of different "frequencies" (like different musical notes). Some notes make up the texture of the skin; others make up the lump. This attack messes with the specific "notes" (frequencies) that make up the texture of the tissue.
    • The Effect: It's like changing the pitch of a song slightly so it sounds like a different song to a robot, even though a human listener can't tell the difference.
    • Result: This trick also confused the robot, but in a different way. It made the robot draw the lump too small or miss parts of it.
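The two flavors above can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's SSAA or FDUA: the boundary location, noise level, and frequency band are made-up values, chosen only to show where each attack operates:

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.uniform(0.0, 1.0, size=(64, 64))  # stand-in for a grainy ultrasound frame

# --- Spatial-domain flavor (cf. SSAA): small noise concentrated near a boundary.
boundary_band = np.zeros_like(image, dtype=bool)
boundary_band[28:36, :] = True                 # pretend the nodule edge runs here
spatial_attack = image + 0.03 * rng.standard_normal(image.shape) * boundary_band

# --- Frequency-domain flavor (cf. FDUA): perturb selected frequency "notes".
spectrum = np.fft.fft2(image)
freqs_y = np.fft.fftfreq(image.shape[0])[:, None]
freqs_x = np.fft.fftfreq(image.shape[1])[None, :]
radius = np.sqrt(freqs_y**2 + freqs_x**2)
band = (radius > 0.1) & (radius < 0.3)         # mid frequencies: tissue texture
spectrum[band] *= 1.05                          # tiny amplification of that band
freq_attack = np.real(np.fft.ifft2(spectrum))

# Both attacks change the image only slightly in pixel terms.
print("spatial max change:", np.abs(spatial_attack - image).max())
print("frequency mean change:", np.abs(freq_attack - image).mean())
```

The key contrast: the spatial attack touches a handful of pixels directly, while the frequency attack spreads a structured change across the whole image, which is why pixel-level "cleaning" defenses struggle to undo it.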

The Shields: Can We Fix It?

Once the robot was tricked, the researchers tried three different "shields" to clean up the image before the robot looked at it. Think of these as different ways to wash a dirty window before looking through it.

  1. The "Blur and Shuffle" Shield (Randomized Preprocessing):

    • How it works: Before the robot looks at the image, the computer quickly blurs it, shrinks it, and stretches it a few times, then averages the results.
    • The Analogy: It's like shaking a snow globe to settle the snow, then looking at it from different angles to get a clear picture.
    • Did it work? Yes, against the "Static Noise" trick. It helped the robot recover about 29% of its lost accuracy. But it failed completely against the "Frequency Hacking" trick.
  2. The "Denoising" Shield (Deterministic Input Denoising):

    • How it works: This is a simple filter that smooths out the grainy static in the image, like using a photo editor to remove noise.
    • The Analogy: It's like wiping a foggy windshield with a cloth.
    • Did it work? This was the best shield against the "Static Noise" trick, recovering 36% of the lost accuracy. However, just like the first shield, it couldn't stop the "Frequency Hacking" trick at all.
  3. The "Group Vote" Shield (Stochastic Ensemble):

    • How it works: The computer creates five slightly different versions of the image, asks the robot to draw the lump on all five, and then takes a "majority vote" on the final drawing.
    • The Analogy: It's like asking five different people to draw a map and then combining their best parts to get the most accurate one.
    • Did it work? It helped a little bit with the "Static Noise" trick (recovering 28%), but failed against the "Frequency Hacking" trick.
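The three shields can also be sketched as code. Again this is a hedged toy, not the paper's implementation: the segmenter is a bare threshold, the "blur" is a simple box filter, and the resize step of the first shield is approximated by small random jitter for brevity:

```python
import numpy as np

def blur(image, k=3):
    # Simple box blur via edge-padding + local averaging (stand-in for a
    # real smoothing/denoising filter).
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (k * k)

def toy_segment(image, threshold=0.5):
    return image > threshold  # stand-in for the segmentation network

rng = np.random.default_rng(2)
attacked = rng.uniform(0.3, 0.7, size=(32, 32))  # pretend this image is attacked

# 1. Randomized preprocessing: jitter the input several times and average.
def randomized_preprocess(image, n=4):
    versions = [blur(image + 0.01 * rng.standard_normal(image.shape))
                for _ in range(n)]
    return np.mean(versions, axis=0)

# 2. Deterministic denoising: one fixed smoothing pass, no randomness.
def denoise(image):
    return blur(image)

# 3. Stochastic ensemble: segment several jittered copies, majority-vote per pixel.
def ensemble_vote(image, n=5):
    masks = [toy_segment(image + 0.01 * rng.standard_normal(image.shape))
             for _ in range(n)]
    return np.mean(masks, axis=0) > 0.5  # "nodule" if most copies agree

mask_a = toy_segment(randomized_preprocess(attacked))
mask_b = toy_segment(denoise(attacked))
mask_c = ensemble_vote(attacked)
print([m.shape for m in (mask_a, mask_b, mask_c)])
```

Note what all three have in common: they operate on pixels (smoothing, jittering, voting), which is exactly why they can wash out pixel-level static but cannot undo a structured change made in the frequency domain.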

The Big Takeaway

The most important lesson from this paper is that not all tricks are created equal.

  • The "Static" tricks (messing with the grainy texture) were like a simple disguise. The robot's "shields" (cleaning the image) could see through them and fix the problem.
  • The "Frequency" tricks (messing with the underlying structure of the image) were like a master-level disguise. The shields couldn't tell the difference between the trick and the real image, so they couldn't fix it.

Why This Matters

This study shows that while we are building amazing AI doctors, they are still vulnerable to clever hackers. If a hospital uses an AI to scan for thyroid nodules, a hacker could potentially send a "poisoned" image that looks normal but causes the AI to miss a cancerous lump.

The good news is that simple cleaning tools can stop some of these attacks. The bad news is that the most sophisticated attacks (the frequency ones) are currently very hard to stop with simple tools. The researchers conclude that we need to build smarter robots that are trained to recognize these tricky frequency patterns from the start, rather than just trying to clean the image afterwards.

In short: AI is getting better at medicine, but it's also getting easier to trick. We need to keep building better shields, because the hackers are already thinking outside the box.
