Comparative Analysis of Patch Attack on VLM-Based Autonomous Driving Architectures

This paper introduces a systematic framework for evaluating black-box patch attacks on three vision-language model-based autonomous driving architectures in the CARLA simulator. It reveals severe, sustained vulnerabilities and distinct failure patterns, highlighting how poorly current designs hold up against physical adversarial threats.

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Long Cheng, Abolfazl Razi, Mert D. Pesé

Published Wed, 11 Ma

Imagine you are teaching a self-driving car how to drive. Instead of just showing it thousands of pictures of roads, you give it a super-smart "brain" that can see the world and talk about it, like a human passenger who says, "Oh look, there's a dog, let's slow down." These are called Vision-Language Models (VLMs). They are the new hot thing in autonomous driving because they seem to understand context and reason better than old-school computer vision.

But here's the scary part: What if someone puts a weird, confusing sticker on a bus stop sign to trick this super-brain?

This paper is a "security test" to see if these new, fancy AI drivers can be fooled by physical stickers (adversarial patches) placed on real-world objects like billboards and bus shelters. The researchers didn't just test one car; they tested three different "brain" designs to see which one is the most gullible.

Here is the breakdown of their findings using some simple analogies:

1. The Setup: The "Trick-or-Treat" Test

The researchers set up two scenarios in a video game simulator (CARLA) that acts like a digital driving school:

  • Scenario A (The Crosswalk): A car is approaching a crosswalk with a pedestrian. The attacker puts a weird, patterned sticker on a nearby bus shelter. The goal? Make the car ignore the pedestrian and speed right through.
  • Scenario B (The Highway): A car is driving fast on a highway. There is a concrete barrier on the right. The attacker puts a huge, weird sticker on a billboard. The goal? Make the car think it's safe to turn right into the concrete wall.

They used a "black-box" method, meaning they didn't know how the cars' brains were built inside. They just threw stickers at them and watched what happened, just like a hacker might in the real world.
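The paper doesn't reproduce its optimizer here, but a black-box patch attack in this "throw stickers and watch" spirit can be sketched as a simple query-based random search: mutate the patch, keep mutations that make the car behave worse, and never look inside the model. The `run_scenario` function below is an illustrative stand-in for a full CARLA rollout, not the authors' code.

```python
import random

def run_scenario(patch):
    """Illustrative stand-in for a CARLA rollout: returns a score for how
    badly the driving policy misbehaved with this patch texture applied
    (higher = worse). Here it is just a toy criterion (the patch mean)."""
    return sum(patch) / len(patch)

def random_search_patch(width=8, height=8, iters=200, seed=0):
    """Black-box random search: mutate one patch pixel at a time and keep
    the change only if the failure score improves. No gradients or model
    internals are needed, which is what makes the attack 'black-box'."""
    rng = random.Random(seed)
    patch = [rng.random() for _ in range(width * height)]
    best = run_scenario(patch)
    for _ in range(iters):
        i = rng.randrange(len(patch))
        old = patch[i]
        patch[i] = rng.random()          # propose a mutation
        score = run_scenario(patch)
        if score > best:
            best = score                 # keep the mutation
        else:
            patch[i] = old               # revert it
    return patch, best

patch, score = random_search_patch()
```

In the real setup, each query would be a full simulated drive past the patched bus shelter or billboard, so query efficiency matters far more than in this toy version.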

2. The Three "Brains" Being Tested

The researchers compared three different ways of building these AI drivers:

  • Dolphins: This model is like a storyteller. It looks at the road and writes a free-flowing story about what it sees. It connects the image and the words very tightly.
  • OmniDrive (Omni-L): This model is like a translator. It takes the image and uses a simple, direct map to turn it into words. It's very structured but maybe a bit rigid.
  • LeapVAD: This model is like a two-person team. One person (the "Fast Thinker") makes quick decisions, while the other (the "Slow Thinker") double-checks the logic. It specifically looks out for dangerous things like people and barriers.
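The "two-person team" idea behind LeapVAD can be sketched as a wrapper in which a fast heuristic proposes an action and a slower checker vetoes it against a hazard checklist. The class, hazard names, and actions below are illustrative assumptions, not the paper's implementation; the point is only to show why a checklist helps, and why it fails if perception never reports the hazard.

```python
from dataclasses import dataclass

# The "slow thinker's" checklist of dangerous object categories (assumed names).
HAZARDS = {"pedestrian", "barrier"}

@dataclass
class Scene:
    objects: set          # what the perception stack reports seeing
    proposed_action: str  # the fast thinker's quick decision

def slow_thinker(scene: Scene) -> str:
    """Double-check the fast decision: if any checklist hazard is in view
    and the proposal is aggressive, override with a conservative action."""
    if scene.objects & HAZARDS and scene.proposed_action in {"accelerate", "turn_right"}:
        return "brake"
    return scene.proposed_action

# If the patch wipes the pedestrian out of the *perceived* scene, the
# checklist never fires -- the failure mode described in the results below.
safe = slow_thinker(Scene({"pedestrian", "crosswalk"}, "accelerate"))  # "brake"
fooled = slow_thinker(Scene({"crosswalk"}, "accelerate"))              # "accelerate"
```

The design choice this illustrates: a double-check layer is only as good as the perception feeding it, which is exactly where the adversarial patch strikes.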

3. The Results: Everyone Got Fooled (But in Different Ways)

The news is not great. All three systems failed badly: the stickers caused the wrong decision almost 75% of the time, compared with a baseline error rate of about 4% when no patch was present.
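To put those two numbers side by side, the jump from a ~4% baseline error rate to a ~75% failure rate under attack is roughly a 19x increase in how often the car gets it wrong:

```python
baseline_error = 0.04   # error rate without the patch (from the paper)
attack_success = 0.75   # failure rate with the patch applied

lift = attack_success / baseline_error
print(f"The patch makes failures about {lift:.0f}x more likely")
```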

Here is how each "brain" reacted to the trick:

  • Dolphins (The Storyteller) got the most confused.

    • The Analogy: Imagine you are reading a story, and someone puts a smudge on the page that makes you think the hero is a villain. The whole story changes.
    • The Result: When the sticker was there, Dolphins completely forgot the pedestrian existed. It didn't just make a bad decision; it wrote a story saying, "The road is clear!" when a person was standing right there. Its "storytelling" brain was easily corrupted.
  • OmniDrive (The Translator) was consistently weak.

    • The Analogy: Imagine a translator who translates every sentence word-for-word without understanding the context. If you trick the first word, the whole sentence is wrong.
    • The Result: It failed almost equally in both scenarios. Because its translation method is so direct, the sticker messed up its logic every single time, regardless of how close the car was.
  • LeapVAD (The Two-Person Team) was the "best" of the worst.

    • The Analogy: Imagine a security guard who has a specific checklist for "dangerous people." Even if someone tries to trick him with a weird hat, he still checks the checklist.
    • The Result: It was slightly better at spotting the pedestrian (the "Fast Thinker" helped). However, when the trick was about the highway barrier, it still failed. It proved that even a "smart" system with a double-check mechanism can be tricked into thinking a wall is a safe exit.

4. The "Zombie" Effect (Time Matters)

One of the most chilling findings was persistence.
Usually, if a car makes a mistake for one second, a safety system might catch it. But these stickers didn't just cause a one-second glitch. Once the car saw the sticker, it kept making the same wrong decision for 6 to 8 seconds (about 150 meters of driving).

  • The Metaphor: It's like putting a spell on a driver. Once the spell is cast, the driver keeps driving off a cliff for several seconds before waking up. You can't just "wait it out."
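The persistence numbers also imply serious speed. Covering about 150 meters in 6 to 8 seconds means the car is travelling at roughly 19 to 25 m/s (about 68 to 90 km/h) the entire time it is committed to the wrong decision. A quick check:

```python
distance_m = 150.0   # distance covered while the wrong decision persists
for duration_s in (6.0, 8.0):
    speed_ms = distance_m / duration_s
    print(f"{duration_s:.0f} s -> {speed_ms:.1f} m/s ({speed_ms * 3.6:.0f} km/h)")
```

At highway speeds, a 6-to-8-second lapse is far longer than the reaction window any downstream safety system is likely to assume.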

5. The Big Takeaway

The paper concludes that current AI drivers are not ready for the real world's tricksters.

  • The "Hallucination" Problem: The cars didn't just drive wrong; they hallucinated. They confidently described a road full of people as "empty" or a wall as an "exit." They didn't just fail to see the danger; they actively lied to themselves about what they saw.
  • No Silver Bullet: There isn't one "perfect" architecture. Some are better at spotting people, others at understanding roads, but all of them can be broken by a simple piece of paper with a weird pattern on it.

In short: We are building self-driving cars with brains that can talk and reason, but we haven't taught them how to ignore a cleverly placed sticker. Until we fix this, these "smart" cars are still very vulnerable to being tricked by a piece of paper on a bus stop.