Detection and Measurement of Hailstones with Multimodal Large Language Models

This study demonstrates that pre-trained multimodal large language models, particularly when guided by a two-stage prompting strategy that leverages reference objects, can effectively detect and measure hailstone diameters from crowdsourced social media images with an average error of 1.12 cm, offering a promising complement to traditional hail sensors for rapid severe weather assessment.

Moritz Alker, David C. Schedl, Andreas Stöckl

Published 2026-02-27

Imagine you are a meteorologist trying to figure out how big a hailstone is. In the old days, you'd have to wait for a storm to pass, then send a team of people out with rulers and "hail pads" (special mats) to catch the ice. It's like trying to measure the size of a tsunami by sticking a single ruler in the ocean at one spot—you miss most of the action, and you only know what happened right where you stood.

This new paper is about a smarter, faster way to do this: asking Artificial Intelligence to look at photos people post on social media and guess the size of the hail.

Here is the story of how they did it, broken down into simple parts:

1. The Problem: The "Blind Spot"

Hailstorms are expensive and dangerous. They smash cars and destroy crops. But our current weather sensors are sparse; they are like a few scattered streetlights in a huge, dark city. They miss a lot of the hail. Meanwhile, millions of people are snapping photos of the destruction and posting them on Instagram, X (Twitter), and Facebook. These photos are a goldmine of data, but they are messy. One photo might show a giant hailstone next to a hand; another might show tiny hail from far away with no reference.

2. The Solution: The "Super-Intelligent Detective"

The researchers didn't build a new robot from scratch. Instead, they used Multimodal Large Language Models (MLLMs). Think of these models as super-smart detectives who have read the entire internet and seen billions of pictures. They can "see" an image and "read" a question at the same time.

The team gathered 474 photos of hailstones from Austria (taken between 2022 and 2024). They knew the real size of the hail in these photos (the "ground truth") because they came from official storm databases. They wanted to see if the AI could guess the size just by looking at the pictures.

3. The Trick: The "Two-Step Dance"

They tested the AI with two different ways of asking the question:

  • The "Direct Shot" (Strategy 1): They just asked, "How big is this hail?"
    • Result: The AI often got confused or guessed wildly, especially if the photo was taken from far away. It was like asking someone to guess the size of a car in a photo without any other objects for comparison.
  • The "Two-Step Dance" (Strategy 2): This was the winner.
    • Step 1: The AI first looks for a reference object. It asks, "Is there a hand, a coin, a ruler, or a cigarette in this picture?"
    • Step 2: Once it finds the reference (e.g., "Aha! That's a human hand!"), it uses its knowledge that "a hand is usually about X inches wide" to calculate the size of the hail next to it.
    • Result: This was like giving the detective a ruler to hold up against the photo. The accuracy jumped significantly. (A rough code sketch of this two-step flow follows below.)
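To make the "Two-Step Dance" concrete, here is a minimal sketch of how such a two-step query might look in code. It assumes the OpenAI Python client and the gpt-4o model (the paper's best performer); the exact prompts, settings, and answer parsing the authors used may differ.

```python
# Minimal sketch of the "Two-Step Dance" (Strategy 2), assuming the
# OpenAI Python client and the gpt-4o model. The prompts, settings,
# and answer parsing the authors actually used may differ.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(prompt: str, image_url: str) -> str:
    """Send one text+image question to the model and return its answer."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content


def estimate_hail_size(image_url: str) -> str:
    # Step 1: look for a reference object of roughly known size.
    reference = ask(
        "Does this photo contain a reference object of known size, "
        "such as a hand, coin, ruler, or cigarette? "
        "Answer with the object's name, or 'none'.",
        image_url,
    )
    if "none" in reference.lower():
        # No ruler to hold up: fall back to a direct guess (Strategy 1).
        return ask("Estimate the hailstone's maximum diameter in "
                   "centimeters. Answer with a single number.", image_url)
    # Step 2: use the reference object's typical size as a built-in ruler.
    return ask(
        f"The photo contains this reference object: {reference.strip()}. "
        "Using its typical real-world size as a ruler, estimate the "
        "hailstone's maximum diameter in centimeters. "
        "Answer with a single number.",
        image_url,
    )
```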

4. The Results: "Pretty Good, But Not Perfect"

The best AI model (GPT-4o) estimated hailstone sizes to within about 1.12 centimeters of the true value, on average (a small worked example of this metric follows the list below).

  • The Good News: If a hailstone were 5 cm, the AI might guess 4.5 or 5.5 cm. That is close enough to tell a farmer, "Hey, this is a dangerous storm!"
  • The Bad News: The AI had a habit of being a bit shy. It tended to underestimate the size (guessing the hail was smaller than it really was). This is like a nervous witness saying, "I think the car was small," when it was actually huge.
  • The Hand Factor: The photos with human hands in them were the easiest for the AI. The error dropped to just 0.75 cm. The AI knows what a hand looks like, so it's a perfect ruler. Photos with no reference objects were much harder.
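To pin down what "within 1.12 cm on average" and "tended to underestimate" mean, here is a tiny sketch of the two standard quantities involved. The numbers are made-up placeholders for illustration, not the paper's data.

```python
# The two quantities behind the results above: mean absolute error
# (average miss distance) and mean signed error (systematic bias,
# negative when the model underestimates). The values below are
# made-up placeholders for illustration, not the paper's data.
true_cm = [5.0, 3.2, 7.5, 4.0]       # ground-truth diameters
predicted_cm = [4.4, 3.0, 6.8, 3.9]  # model estimates

n = len(true_cm)
mae = sum(abs(p - t) for p, t in zip(predicted_cm, true_cm)) / n
bias = sum(p - t for p, t in zip(predicted_cm, true_cm)) / n

print(f"Mean absolute error: {mae:.2f} cm")
print(f"Mean signed error:   {bias:.2f} cm")  # negative => underestimation
```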

5. Why This Matters

Imagine a future where, during a storm, an automated system instantly scans social media, finds photos of hail, and tells the weather service: "People in this neighborhood are posting photos with hail the size of golf balls!"

This would happen in real time, covering areas where official sensors don't exist. It turns the "crowd" into a massive, distributed sensor network.

The Bottom Line

This paper shows that we don't necessarily need special, expensive cameras to measure hail anymore. We can use the AI tools we already have, combined with the photos people are already taking.

  • The Analogy: It's like realizing you don't need a professional surveyor to measure a room; you just need a photo of the room with a person standing in it, and a smart computer to do the math.
  • The Catch: We still need to build the system that automatically finds these photos on social media in real time. Once that's done, we'll have a super-powerful tool to track severe weather faster and more accurately than ever before.
