VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments

This paper introduces VANGUARD, a lightweight geometric perception tool that enables LLM-based UAV agents operating in GPS-denied environments to accurately estimate Ground Sample Distance and recover metric scale by leveraging detected vehicles as environmental anchors, thereby significantly reducing spatial hallucinations and catastrophic failures compared to state-of-the-art vision-language models.

Yifei Chen, Xupeng Chen, Feng Wang, Niangang Jiao, Jiayin Liu

Published 2026-03-05

Imagine you are flying a drone over a city, but you've lost your GPS signal and your camera's "ID card" (metadata) is missing. You can see the world below, but you have no idea how big anything actually is. Is that parking lot 10 feet wide or 100 feet? Is that car 15 feet long or 150 feet?

Without this "size sense," your drone is like a person trying to catch a ball in the dark: they might guess where it is, but they'll likely drop it.

This paper introduces VANGUARD, a smart tool designed to give drones their "size sense" back, even without GPS or camera metadata.

The Problem: The "Guessing Game" of AI

Recently, engineers started using super-smart AI models (vision-language models, or VLMs, and large language models, or LLMs) to help drones plan their missions. These AIs are great at understanding language and recognizing objects. "I see a swimming pool," they might say.

But when asked, "How big is it?" these AIs start hallucinating.

The researchers tested five of the smartest AI models available. When asked to measure the area of a field just by looking at a picture, their estimates were off by roughly 38% to 52% on average. Sometimes they thought a small pond was the size of a lake, or a tiny car was the size of a bus. In the real world, if a drone thinks a landing spot is huge when it's actually tiny, it could crash. This is called Spatial Scale Hallucination.

The Solution: VANGUARD (The "Car Ruler")

Instead of asking the AI to "guess" the size, the researchers built a simple, math-based tool called VANGUARD.

Here is the clever trick: Cars.

Almost anywhere in the world, and especially in cities, cars are roughly the same size. A standard sedan is about 5 meters (16 to 17 feet) long.

VANGUARD works like this:

  1. Spot the Cars: The drone takes a picture and uses a detector to find all the cars. It draws a box around them.
  2. Count the Pixels: It measures how many pixels long those cars are in the image.
  3. Do the Math: If a detected car is 100 pixels long, and a real car is about 5 meters long, the tool can instantly calculate 5 ÷ 100: "1 pixel = 0.05 meters." This meters-per-pixel value is the Ground Sample Distance (GSD).
  4. The "Ruler" is Born: Now the drone has a ruler. It can measure anything else in the picture (a field, a building, a pool) by counting pixels and multiplying by that ruler: once for a length, and twice (the ruler squared) for an area.
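
For readers who like to see the arithmetic, the four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the 5-meter reference length is taken from the example above, and taking the median over several detected cars is an assumption made here to soften the effect of one odd detection.

```python
from statistics import median

# Reference car length from the example in the text (an assumption,
# not a calibrated value from the paper).
ASSUMED_CAR_LENGTH_M = 5.0


def estimate_gsd(car_lengths_px, car_length_m=ASSUMED_CAR_LENGTH_M):
    """Return the scale in meters per pixel from detected car lengths (pixels)."""
    if not car_lengths_px:
        raise ValueError("no cars detected; scale cannot be estimated")
    # Median is an illustrative choice for combining several cars.
    return car_length_m / median(car_lengths_px)


def length_m(length_px, gsd_m_per_px):
    """Convert a pixel length to meters: multiply by the ruler once."""
    return length_px * gsd_m_per_px


def area_m2(area_px, gsd_m_per_px):
    """Convert a pixel area to square meters: multiply by the ruler squared."""
    return area_px * gsd_m_per_px ** 2


if __name__ == "__main__":
    gsd = estimate_gsd([98, 100, 104])   # median car is 100 pixels long
    print(f"GSD: {gsd:.3f} m/pixel")
    print(f"Pool of 40,000 pixels: {area_m2(40_000, gsd):.1f} square meters")
```

With a median car length of 100 pixels the ruler comes out at 0.05 meters per pixel, so a 40,000-pixel pool works out to roughly 100 square meters.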

Why is this better than the "Smart" AI?

The researchers compared their method to the "guessing" AI models.

  • The Guessing AI: Tries to use its "common sense" to estimate size. It fails because it doesn't truly understand physics; it just predicts what words usually go together.
  • VANGUARD: Uses deterministic math. It doesn't guess; it calculates. It's like using a tape measure instead of guessing how long a rope is by looking at it.

The Results:

  • VANGUARD was wrong by only 6.87% on average.
  • The Guessing AI was wrong by 38% to 52% on average.
  • When measuring a swimming pool, the Guessing AI thought it was 50% smaller than it really was. VANGUARD got it within 1.5% of the real size.

The "Safety Switch"

VANGUARD is also smart about when not to trust itself.

  • If the image is too blurry (cars look like tiny dots), VANGUARD says, "I can't measure this accurately."
  • It gives the drone a confidence score. If the score is low, the drone knows, "Okay, I shouldn't try to land here based on this measurement. I'll wait for better data or use a different strategy."
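
The summary does not say how VANGUARD computes its confidence score, so the sketch below is only one plausible way such a safety switch could work. Everything in it is hypothetical: the function names, the 20-pixel minimum car length, the three-car minimum, and the 0.6 trust threshold are illustrative assumptions, not values from the paper.

```python
from statistics import median


def scale_confidence(car_lengths_px, min_length_px=20.0, min_cars=3):
    """Hypothetical confidence score in [0, 1]; not the paper's formula."""
    # Cars that look like tiny dots are too small to measure reliably.
    usable = [p for p in car_lengths_px if p >= min_length_px]
    if not usable:
        return 0.0
    mid = median(usable)
    # Relative spread: how much the cars disagree about the scale.
    spread = median(abs(p - mid) for p in usable) / mid
    count_term = min(1.0, len(usable) / min_cars)   # more cars, more trust
    agreement_term = max(0.0, 1.0 - spread)         # less spread, more trust
    return count_term * agreement_term


def should_trust_scale(car_lengths_px, threshold=0.6):
    """Return True when the (assumed) score clears the (assumed) threshold."""
    return scale_confidence(car_lengths_px) >= threshold


if __name__ == "__main__":
    print(should_trust_scale([98, 100, 104]))  # several clear, consistent cars
    print(should_trust_scale([4, 5, 6]))       # cars are only tiny dots
```

The design idea is the one described above: the tool reports how much it trusts its own ruler, and the planning agent decides whether to act on the measurement or wait for better data.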

The Big Picture

This paper teaches us a valuable lesson for the future of robotics: Don't rely on a general "smart" brain to do everything.

Instead, give the robot a specialized tool for specific jobs.

  • Let the "Smart Brain" (LLM) decide where to go and what to do.
  • Let the "Math Tool" (VANGUARD) handle the exact measurements.

By combining the two, we get a drone that is both smart enough to plan a mission and precise enough to execute it safely without crashing into things it thinks are bigger or smaller than they really are.

In short: VANGUARD turns a drone with a broken ruler into one with a reliable, invisible tape measure, using nothing but the cars parked on the street.