Automated Quality Check of Sensor Data Annotations

This paper presents an open-source tool that automates the quality assurance of multi-sensor railway training data by detecting nine common annotation errors with high precision, thereby significantly reducing manual workload and accelerating the development of AI-driven automated driving systems.

Niklas Freund, Zekiye Ilknur-Öz, Tobias Klockau, Patrick Naumann, Philipp Neumaier, Martin Köppel

Published 2026-03-03

Imagine you are training a brand-new robot to drive a train. This robot doesn't have eyes or a brain like a human; instead, it has cameras, lasers, and radar sensors that feed it a constant stream of data. To teach this robot how to see the world—where the tracks are, where the people are, and where the obstacles are—you have to show it millions of examples.

But here's the catch: Garbage in, garbage out. If you teach the robot with bad examples, it will make dangerous mistakes.

This paper is about a team from Deutsche Bahn (Germany's national railway) who built a "smart teacher" to check their training examples before they ever reach the robot.

The Problem: The "Human Eye" Bottleneck

Think of the data they collect as a massive library of photos and 3D maps. In these photos, humans have drawn boxes around trains, people, and tracks to tell the AI, "This is a person," or "This is the rail."

In the past, checking these drawings was like a teacher grading a stack of 200,000 homework papers by hand. It's slow and tedious, and tired humans miss mistakes. If a teacher misses a mistake on a homework paper, the student (the AI) learns the wrong lesson.

The Solution: The "Auto-Grader"

The team built a piece of software (an open-source tool called RailLabel-providerkit) that acts like an ultra-fast, tireless auto-grader. Instead of reading every single paper, it scans the whole library in seconds, looking for nine specific types of "silly mistakes" that humans often make when drawing these boxes.

Here are the nine mistakes it looks for, explained with everyday analogies:

  1. The "Sky-High" Track: Imagine someone drawing a railroad track that goes up into the clouds. The software says, "Wait, trains don't fly. That's wrong!"
  2. The "Giant" Person: If the software sees a drawing of a person that is 10 feet tall (3 meters), it flags it. "That's not a human; that's a giant!"
  3. The "Confused" Label: If the same object, say a catenary pole (a pole holding overhead wires), is labeled "structured" in one photo but "solid" in another, the software flags the contradiction. "You can't be two different things at once!"
  4. The "Missing ID": If a train is drawn but has no name tag (ID number), the software raises a hand. "Who are you? I need your ID!"
  5. The "Wrong Species": If someone accidentally labels a dog as a "Person," the software corrects them. "That's a dog, not a human!"
  6. The "Lost Train": The train the camera is riding on (the "Ego" train) must be clearly marked. If the software can't find the train it's supposed to be on, it sounds an alarm.
  7. The "Double-Left" Track: A normal track has one left rail and one right rail. If the drawing shows two left rails, the software says, "Physics doesn't work that way."
  8. The "Backwards" Track: If the left rail is drawn on the right side and vice versa, the software flags it. "You got your left and right mixed up!"
  9. The "Looping" Path: If a track is supposed to switch to a new line but the drawing shows it starting and ending on the same line, the software says, "You didn't actually go anywhere."
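At their core, the nine checks above are plausibility rules applied to annotation data. As a rough illustration of the idea, here is a minimal sketch of a check like #2 ("Giant" Person). The data model, function name, and 2.5 m threshold are all illustrative assumptions, not the actual raillabel-providerkit API:

```python
# Hypothetical sketch of a size-plausibility check (rule #2 above).
# The Box3D model and the threshold are assumptions for illustration,
# not the real raillabel-providerkit data structures.

from dataclasses import dataclass

MAX_PERSON_HEIGHT_M = 2.5  # anything taller gets flagged as a "giant"

@dataclass
class Box3D:
    label: str       # e.g. "person", "train"
    height_m: float  # height of the 3D bounding box in meters

def check_person_height(boxes):
    """Return a human-readable issue for every implausibly tall person box."""
    return [
        f"{box.label} box is {box.height_m:.1f} m tall "
        f"(limit {MAX_PERSON_HEIGHT_M} m)"
        for box in boxes
        if box.label == "person" and box.height_m > MAX_PERSON_HEIGHT_M
    ]

# A 1.8 m person passes; a 3.0 m "person" is flagged.
issues = check_person_height([Box3D("person", 1.8), Box3D("person", 3.0)])
```

The other checks follow the same pattern: each one scans the annotations for a single kind of impossibility and reports every hit, so a whole dataset can be swept automatically.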

How Good Is It?

The team tested this "Auto-Grader" on a real dataset that had already been checked by humans. They wanted to see if their software could find the mistakes the humans missed.

The results were impressive:

  • Six of the nine checks achieved 100% precision. Every time the software said "Error," it was actually an error.
  • The other three checks reached 96-97% precision. They raised very few false alarms, mostly flagging things that were technically "wrong" but actually just weird, real-world edge cases (like a construction vehicle with a giant arm that looked too big).
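"Precision" here answers one question: of everything the software flagged, how much was a real error? A quick sketch of the arithmetic (the counts below are made up for illustration, not taken from the paper):

```python
# Precision = true positives / (true positives + false positives):
# of all the annotations the checker flagged, what fraction were
# genuinely wrong?
def precision(true_positives: int, false_positives: int) -> float:
    return true_positives / (true_positives + false_positives)

# Illustrative counts only: if a check flags 100 annotations and
# 96 of them are real errors, its precision is 0.96 (96%).
p = precision(96, 4)
```

High precision matters for an auto-grader: it means the humans reviewing its reports waste almost no time on false alarms.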

Why This Matters

By using this tool, the railway company can:

  • Save Time: They don't have to manually check every single photo.
  • Save Money: They can process data faster.
  • Increase Safety: They ensure the AI driving the trains is learning from perfect data, which is crucial when lives are at stake.

The "Open Door" Policy

The best part? The team didn't keep this tool a secret. They released the code for free (Open Source) so that other researchers and companies can use it. It's like they built a better grading machine and handed the blueprints to the whole world so everyone can build safer, smarter trains.

In short: They built a digital inspector that catches the silly mistakes in train data, ensuring that the AI learning to drive our trains gets the best possible education.