DriveCode: Domain Specific Numerical Encoding for LLM-Based Autonomous Driving

This paper introduces DriveCode, a novel numerical encoding method that represents numbers as dedicated embeddings instead of discrete tokens to overcome precision and efficiency limitations, thereby significantly improving trajectory prediction and control signal generation in LLM-based autonomous driving systems.

Zhiye Wang, Yanbo Jiang, Rui Zhou, Bo Zhang, Fang Zhang, Zhenhua Xu, Yaqin Zhang, Jianqiang Wang

Published 2026-03-03

Imagine you are teaching a brilliant, well-read robot how to drive a car. This robot is a Large Language Model (LLM): think of it as a super-smart librarian who has read every book in the world. It understands stories and traffic laws, and it can describe a beautiful sunset perfectly.

However, there's a problem: The robot is terrible at math.

The Problem: The "Word-Counting" Robot

In a standard LLM, numbers are treated just like words. If the robot sees the number 3.14, it doesn't see a single quantity. Instead, it sees several separate "tokens" (like puzzle pieces), for example 3, ., and 14. The exact split depends on the tokenizer.

To the robot, 3.14 is just a sequence of symbols, like the word "apple." It doesn't inherently understand that 3.14 is bigger than 3.05, or that 10.0 is exactly double 5.0. It's like asking a librarian to compare the weight of two books just by looking at their titles. They might guess, but they often get it wrong.
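
To make the puzzle-piece idea concrete, here is a minimal sketch using a toy tokenizer. The splitting rule and the function names (toy_tokenize, token_overlap) are assumptions made for illustration; real LLM tokenizers use learned vocabularies and may split numbers differently.

```python
import re

def toy_tokenize(text):
    """Toy stand-in for an LLM tokenizer: words, digit runs, and punctuation."""
    return re.findall(r"[A-Za-z]+|\d+|[^\w\s]", text)

def token_overlap(a, b):
    """Count the positions at which two token lists hold the same token."""
    return sum(x == y for x, y in zip(a, b))

base = toy_tokenize("3.14")   # ['3', '.', '14']
near = toy_tokenize("3.15")   # numerically 0.01 away from 3.14
far = toy_tokenize("7.14")    # numerically 4.00 away from 3.14

# Both comparisons share exactly two tokens with "3.14", so token
# overlap alone cannot tell the close number from the distant one.
overlap_near = token_overlap(base, near)
overlap_far = token_overlap(base, far)
print(base, overlap_near, overlap_far)
```

In this toy setup, 3.15 and 7.14 look equally similar to 3.14 at the token level, even though one is a hundred times farther away as a quantity.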

In autonomous driving, this is dangerous. If the robot believes the car is moving at 3.14 m/s when the true speed is 3.41 m/s, that small misreading feeds into every braking and steering decision and could contribute to a crash. The robot needs to understand numbers as continuous quantities (like a smooth slider), not as broken-up text fragments.

The Solution: DriveCode

The paper introduces DriveCode, a new way to teach this robot to "feel" numbers.

Here is the analogy:

  • Old Way (Text Tokens): Imagine you are trying to tell the robot the speed of the car. You say, "The speed is three point one four." The robot has to piece these words together to guess the number. It's clunky and imprecise.
  • DriveCode Way (Continuous Embeddings): Instead of speaking in words, you hand the robot a special, smooth dial that is already set to exactly 3.14. You don't say the words; you just hand over the physical value.

How It Works (The "Translator" and the "Math Head")

The researchers built two special tools to make this happen:

  1. The Number Projector (The Translator):
    When the robot reads a prompt like "The car is going 50 mph," the system grabs the number 50 before it turns into a word. It runs it through a special translator (the projector) that turns the raw number into a "math language" the robot understands. This math language is then mixed in with the pictures and the text, so the robot sees the number as a real, physical value, not just a word.

  2. The Number Head (The Math Head):
    When the robot needs to answer, "What speed should I go?", it doesn't have to write the answer out piece by piece as text tokens such as 4, ., and 5. Instead, it has a dedicated "Math Head" that can simply point to a number on a dial and say, "Go 4.5." It skips the step of breaking the number into tokens.
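
The two tools above can be sketched roughly as a pair of small functions: one maps a raw number to an embedding vector, the other maps a hidden state back to a single number. This is only an illustration under stated assumptions: the layer sizes, the tanh activation, the random untrained weights, and every name here (HIDDEN, number_projector, number_head) are hypothetical and are not taken from the paper. In a real system the LLM would sit between the two, and both parts would be trained.

```python
import math
import random

HIDDEN = 8  # hypothetical embedding width, kept tiny for readability
random.seed(0)

def make_linear(n_in, n_out):
    """Dense layer with random weights and zero bias (untrained, illustration only)."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(layer, x):
    """Apply a dense layer: one weighted sum plus bias per output unit."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

# "Number projector": raw scalar -> embedding vector (assumed two-layer MLP).
proj_in = make_linear(1, HIDDEN)
proj_out = make_linear(HIDDEN, HIDDEN)

def number_projector(value):
    hidden = [math.tanh(h) for h in linear(proj_in, [value])]
    return linear(proj_out, hidden)

# "Number head": hidden state -> one scalar, produced in a single step.
head = make_linear(HIDDEN, 1)

def number_head(hidden_state):
    return linear(head, hidden_state)[0]

def embedding_distance(a, b):
    """Euclidean distance between the embeddings of two numbers."""
    ea, eb = number_projector(a), number_projector(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ea, eb)))

embedding = number_projector(3.14)
# Here the embedding stands in for the LLM's final hidden state.
predicted = number_head(embedding)
```

Because the projector is a continuous function, a tiny change in the input (3.14 to 3.15) gives only a tiny change in the embedding, which is the "smooth dial" behavior the analogy describes. The untrained head returns an arbitrary value; it only shows that the output is one number rather than a string of tokens.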

Why This Matters

Think of driving as a tightrope walk.

  • Without DriveCode: The robot is walking the tightrope while trying to count its steps by reading a book. It's slow, and it might trip because it miscounts a step.
  • With DriveCode: The robot has a built-in sense of balance. It feels the wind and the rope directly. It can make micro-adjustments instantly because it understands the exact value of its speed and steering angle.

The Results

The researchers tested this on three different driving datasets (like different driving schools).

  • Accuracy: Compared with handling numbers as text tokens, the robot made fewer mistakes in predicting where the car should go and how fast it should drive.
  • Speed: Because the robot doesn't have to "spell out" numbers one token at a time, it can make decisions faster. It's like the difference between writing a number by hand (slow) versus pressing a button that instantly displays the number (fast).
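
The speed argument can be illustrated by counting generation steps for a short, made-up trajectory. Everything here is an assumption for the sake of the example: the waypoint values, the two-decimal text format, the toy tokenizer, and the idea that a number head spends exactly one step per value. The paper's actual measurements are not reproduced.

```python
import re

def toy_tokenize(text):
    """Toy tokenizer: each digit run and each other symbol is one token."""
    return re.findall(r"\d+|[^\d\s]", text)

# Hypothetical trajectory: three (x, y) waypoints in meters.
waypoints = [(1.25, 0.03), (2.51, 0.07), (3.78, 0.12)]
values = [coord for point in waypoints for coord in point]

# Text route: the model emits every token of the formatted string in turn.
as_text = ",".join(f"{v:.2f}" for v in values)
text_steps = len(toy_tokenize(as_text))

# Number-head route (assumed): one generation step per numeric value.
number_steps = len(values)

print(as_text)                    # 1.25,0.03,2.51,0.07,3.78,0.12
print(text_steps, number_steps)   # 23 6
```

Under these assumptions, the text route needs 23 steps (three tokens per number plus five commas) while the number-head route needs 6, one per value.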

In a Nutshell

DriveCode is like giving a language genius a pair of glasses that lets them see numbers as real, physical objects rather than just words on a page. This allows AI cars to drive more safely, more precisely, and more like a human who intuitively understands speed and distance, rather than a computer that is just guessing based on spelling.