Scaling Multilingual Semantic Search in Uber Eats Delivery

Imagine you walk into a massive, chaotic library that contains everything from pizza recipes to grocery lists and restaurant menus for the entire world. In the past, if you asked the librarian, "I want something spicy," they might only look for the exact word "spicy" in the book titles. If you asked in Spanish, they might not understand you at all. If you wanted a specific dish and a specific grocery item, they'd have to run to two different sections of the library, slowing everything down.

This paper describes how Uber Eats built a new, super-smart librarian (a search engine) that solves all these problems at once. Here is the story of how they did it, explained simply.

1. The Problem: Too Many Librarians, Too Many Rules

Previously, Uber Eats had different search systems for different things: one for restaurants, one for dishes, and one for groceries. They were like separate librarians who didn't talk to each other.

The Issue: If you searched for "tacos," the restaurant librarian might find a taco place, but the grocery librarian wouldn't know you might also want taco shells. It was slow, expensive to maintain, and often missed what you really wanted.

2. The Solution: The "Universal Translator" Librarian

The team built one single, super-smart librarian who understands everything.

The Brain (Qwen2): They used a powerful AI brain (based on a model called Qwen2) that already knows a lot about the world. Think of this as hiring a librarian who has read every book in the world before they even started working at Uber Eats.
The Two Towers: Imagine the librarian has two distinct roles:
- The Listener (Query Tower): This part listens to what you type ("I'm hungry for Italian").
- The Cataloger (Document Tower): This part knows everything about the millions of restaurants and items in the database.
Instead of reading every single book to find an answer, the librarian converts both your question and the books into secret codes (embeddings, which are lists of numbers). If the two codes are close to each other, the librarian knows it's a good match, even if the words are different!
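To make the "secret codes" idea concrete, here is a minimal sketch in Python. The 4-number codes and the item names are made up for illustration (the real system produces its codes with the Qwen2-based towers), and cosine similarity is assumed as the measure of closeness, which this summary does not specify.

```python
import math

def normalize(vec):
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity: close to 1.0 means the two codes point the same way."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

# Hypothetical output of the Cataloger (Document Tower), computed ahead of time.
doc_embeddings = {
    "Luigi's Trattoria (pasta, pizza)": [0.9, 0.1, 0.0, 0.2],
    "Taco shells, 12 pack": [0.1, 0.9, 0.1, 0.0],
    "Sushi Go (nigiri, rolls)": [0.0, 0.1, 0.9, 0.1],
}

# Hypothetical output of the Listener (Query Tower) for "I'm hungry for Italian".
query_embedding = [0.8, 0.2, 0.1, 0.1]

def search(query_vec, docs, top_k=2):
    """Score every document against the query and keep the closest ones."""
    scored = [(cosine(query_vec, vec), name) for name, vec in docs.items()]
    scored.sort(reverse=True)
    return scored[:top_k]

results = search(query_embedding, doc_embeddings)
for score, name in results:
    print(f"{score:.3f}  {name}")
```

Notice that the query never has to contain the word "pasta" or "pizza". The match comes from the codes being close, not from shared words. A production system would replace the loop over every document with an approximate nearest-neighbor index.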

3. The Training: Learning from Millions of Mistakes

You can't just hire a smart librarian and expect them to know Uber Eats immediately. They need to learn.

Phase 1 (The Crowd): They showed the librarian hundreds of millions of examples of what people actually clicked on or added to their carts. This taught the librarian the general rules of "what people usually want."
Phase 2 (The Tough Questions): Then, they used a "tough coach" (another AI) to find the hardest, trickiest questions where the librarian was confused. They practiced on these specifically so the librarian wouldn't make the same mistakes twice.
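Why do the tough questions help so much? The sketch below gives a rough intuition. It assumes an InfoNCE-style contrastive loss, a common choice for training this kind of model, but this summary does not say which loss the team actually used. The similarity scores and the temperature value are invented for illustration.

```python
import math

def info_nce_loss(pos_score, neg_scores, temperature=0.05):
    """Contrastive loss for one query (an assumed loss, not confirmed by the source).

    pos_score:  similarity between the query and the item the user chose.
    neg_scores: similarities between the query and items the user did not choose.
    The loss is small when the chosen item clearly outscores all the others.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    biggest = max(logits)  # subtract the max before exp() for numerical stability
    log_sum = biggest + math.log(sum(math.exp(l - biggest) for l in logits))
    return log_sum - logits[0]

# Phase 1 flavor: random wrong answers are easy to tell apart from the right one.
easy = info_nce_loss(0.9, [0.1, 0.0, -0.2])

# Phase 2 flavor: hard negatives look almost as good as the right answer.
hard = info_nce_loss(0.9, [0.85, 0.8, 0.1])

print(f"loss with easy negatives: {easy:.6f}")
print(f"loss with hard negatives: {hard:.6f}")
```

With easy negatives the loss is already close to zero, so the librarian has almost nothing left to learn from them. The hard negatives produce a much larger loss, which is the training signal that pushes the model to separate near-identical items.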

4. The Magic Trick: The "Matryoshka" Dolls

Here is the coolest part. The librarian creates a very long, detailed secret code (1,536 numbers long) for every item.

The Problem: Storing and searching through these long codes for billions of items is slow and expensive, like trying to carry a giant encyclopedia everywhere.
The Solution (MRL): They used a technique called Matryoshka Representation Learning. Think of a Russian nesting doll.
- The whole doll is the full, detailed code (high quality, but big).
- But you can take off the outer layers, and the inner doll is still a complete, smaller version of the same thing (a little less detailed, but still useful).
- Why it matters: When speed and cost matter most, the system can use the small inner doll (fast, cheap). When quality matters most, it can use the full doll. One model does it all.
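Taking off the outer layers of the doll looks roughly like this in code: keep only the first few numbers of the code and rescale it. This is a minimal sketch with made-up 8-number codes standing in for the real 1,536-number ones. It only shows the truncation step at search time. The special training that makes the short prefix useful on its own is the part Matryoshka Representation Learning provides, and it is not shown here.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` numbers (the inner doll) and rescale to length 1."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def score(query, doc, dim):
    """Cosine similarity using only the first `dim` numbers of each code."""
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(doc, dim)
    return sum(a * b for a, b in zip(q, d))

def rank(query, docs, dim):
    """Return document names ordered from best to worst match."""
    return sorted(docs, key=lambda name: score(query, docs[name], dim), reverse=True)

# Hypothetical codes; the leading numbers carry most of the meaning.
query = [0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
docs = {
    "pad thai": [0.6, 0.6, 0.2, 0.3, 0.1, 0.0, 0.05, 0.0],
    "oat milk": [-0.5, 0.1, 0.6, -0.3, 0.2, 0.1, 0.0, 0.05],
}

ranking_full = rank(query, docs, dim=8)   # the whole doll
ranking_small = rank(query, docs, dim=4)  # the inner doll, half the size

print(ranking_full)
print(ranking_small)
```

In this toy example both sizes put the items in the same order, which is the behavior the technique aims for: the short code costs half as much to store and compare, and the ranking barely changes.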

5. The Results: Faster, Smarter, and Cheaper

When they turned this new system on:

Better Matches: They found the right restaurants and dishes much more often, even in different languages (Spanish, French, Japanese, etc.).
Fewer Empty Searches: People saw "No results found" messages less often.
More Orders: Because people found what they wanted faster, they ordered more food.
Cost Savings: By using the "smaller dolls" (truncated codes) and compressing the data (quantization), they cut storage costs substantially and sped up search, without a noticeable loss in quality.
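To see where savings like these can come from, here is a back-of-the-envelope sketch. It assumes simple 8-bit scalar quantization and a truncated size of 256 numbers. Both are illustrative choices, since this summary does not state which quantization scheme or which truncated size was actually deployed.

```python
def quantize_int8(vec):
    """Map each float to a whole number in [-127, 127] plus one shared scale."""
    scale = max(abs(x) for x in vec) / 127 or 1.0  # avoid a zero scale
    return [round(x / scale) for x in vec], scale

def dequantize(codes, scale):
    """Approximately recover the original floats from the small integers."""
    return [c * scale for c in codes]

vec = [0.12, -0.5, 0.33, 0.0]  # a tiny made-up code
codes, scale = quantize_int8(vec)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(vec, restored))

# Storage per item: the full code as 32-bit floats versus a truncated,
# quantized code (hypothetical 256 numbers at 1 byte each).
full_bytes = 1536 * 4
small_bytes = 256 * 1

print(codes)
print(f"largest rounding error: {max_error:.5f}")
print(f"{full_bytes} bytes -> {small_bytes} bytes per item")
```

Under these assumptions each item shrinks from 6,144 bytes to 256 bytes, a 24x reduction, while each number moves by at most half a quantization step. Whether quality really holds up at a given size is something only the team's own measurements can show.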

The Bottom Line

Uber Eats stopped using a bunch of clumsy, separate tools and built one unified, multilingual, super-smart search engine. It learns from real user behavior, adapts to your speed needs on the fly (like a shape-shifting robot), and helps you find your next meal much faster. It's like upgrading from a dusty card catalog to a psychic librarian who knows exactly what you're craving before you even finish typing.
