From GEV to ResLogit: Spatially Correlated Discrete Choice Models for Pedestrian Movement Prediction

This paper demonstrates that for predicting high-frequency pedestrian movements near autonomous vehicles, a residual neural network logit (ResLogit) model outperforms traditional spatial generalized extreme value (GEV) specifications by more effectively capturing proximity-induced correlations while preserving model interpretability.

Rulla Al-Haideri, Bilal Farooq

Published 2026-03-03

Imagine you are standing on a busy street corner, waiting to cross the road. A self-driving car (an AV) is approaching. In the split second before you move, your brain makes a tiny, almost automatic calculation: Do I speed up? Do I slow down? Do I step left or right?

This paper is about teaching computers to understand that split-second decision-making process, specifically how pedestrians move when interacting with self-driving cars.

Here is the breakdown of the research using simple analogies.

The Problem: The "Grid" of Choices

The researchers didn't try to predict exactly where a pedestrian's foot will land (like predicting a specific coordinate on a map). Instead, they imagined a 3x3 tic-tac-toe board floating in front of the pedestrian.

  • The Center: Keep walking at the same speed and direction.
  • The Top Row: Slow down.
  • The Bottom Row: Speed up.
  • The Left/Right Columns: Step left or right.

Every time a pedestrian moves, they are essentially picking one of these 9 squares. The challenge is that these squares are neighbors. If you pick "Step Left," it's very similar to picking "Step Left and Slow Down." They are so close that they are almost the same choice.
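The 9-square board is just a discrete choice set. A minimal sketch of that encoding (the label names below are illustrative, not the paper's):

```python
from itertools import product

# Hypothetical encoding of the 3x3 action grid described above:
# each square combines a speed change with a lateral step.
speed_change = ["slow_down", "keep_speed", "speed_up"]
lateral_step = ["left", "straight", "right"]

# The 9 discrete alternatives a pedestrian picks between at every step.
choice_set = list(product(speed_change, lateral_step))
print(len(choice_set))  # 9

# Note how ("keep_speed", "left") and ("slow_down", "left") are adjacent
# squares: that adjacency is exactly the "neighbouring choices" problem.
```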

The Old Way: The "Rigid Nesting" Models (GEV)

For a long time, statisticians tried to solve this using "Spatial GEV" models. Think of these models as architects trying to build a house with pre-fabricated rooms.

  • They decided in advance: "Okay, the 'Left' square and the 'Left-Slow' square must be in the same 'room' (nest) because they are neighbors."
  • They built a rigid structure to force the computer to understand that these choices are related.
  • The Result: It worked a little bit better than doing nothing, but it was like trying to fit a round peg in a square hole. The "rooms" were too rigid. The real world is messy, and the pre-built rooms didn't quite capture how people actually make those tiny, fluid adjustments. The improvement was barely noticeable.
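The "pre-fabricated rooms" idea corresponds to a nested logit, one member of the GEV family. Here is a minimal sketch of how a nest makes grouped choices correlate; the nesting, utility numbers, and scale parameter are illustrative, not the paper's actual specification:

```python
import math

# Two hand-picked "rooms": lateral moves share a nest, keeping course
# stands alone. Utilities are made-up numbers for illustration.
utilities = {"left": 1.0, "left_slow": 0.8, "keep": 1.2}
nests = {"lateral": ["left", "left_slow"], "center": ["keep"]}
lam = 0.5  # nest scale in (0, 1]; smaller = stronger within-nest correlation

def nested_logit_probs(utilities, nests, lam):
    # Inclusive value (logsum) summarises each nest's overall attractiveness.
    iv = {n: lam * math.log(sum(math.exp(utilities[a] / lam) for a in alts))
          for n, alts in nests.items()}
    denom = sum(math.exp(v) for v in iv.values())
    probs = {}
    for n, alts in nests.items():
        p_nest = math.exp(iv[n]) / denom  # P(pick this "room")
        z = sum(math.exp(utilities[a] / lam) for a in alts)
        for a in alts:
            # P(alternative) = P(room) * P(alternative | room)
            probs[a] = p_nest * math.exp(utilities[a] / lam) / z
    return probs

probs = nested_logit_probs(utilities, nests, lam)
print(round(sum(probs.values()), 6))  # 1.0
```

The catch the paper points to: the analyst must decide the rooms in advance, and a 3x3 grid of overlapping neighbourhoods does not carve cleanly into a few fixed nests.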

The New Way: The "Smart Tutor" (ResLogit)

The researchers then tried a new approach called ResLogit. Think of this not as an architect, but as a smart tutor.

  1. The Base Lesson: First, the tutor teaches the computer the basic rules of walking (e.g., "If a car is coming fast, slow down"; "If you are far from your destination, keep going"). This is the "Linear Utility" part—it's the human-readable logic.
  2. The Correction: Then, the tutor looks at thousands of real-world examples and says, "Wait, the basic rules aren't perfect. When the car is this close and the pedestrian is this tired, they actually tend to do this specific weird move."
  3. The Learning: The computer learns these tiny "corrections" automatically. It doesn't force the choices into rigid rooms; it learns that "Step Left" and "Step Left-Slow" are neighbors because it saw people make those mistakes or choices together in the data.
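The three steps above can be sketched as a utility function: an interpretable linear part plus a learned skip-connected correction, fed into a softmax over the 9 squares. The feature counts, weights, and the single tanh residual layer below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-alternative features (e.g. distance to the AV,
# deviation from the desired direction) for the 9 grid squares.
n_alts, n_feat = 9, 4
X = rng.normal(size=(n_alts, n_feat))

# 1) The "base lesson": human-readable linear utility with taste weights beta.
beta = rng.normal(size=n_feat)
linear_utility = X @ beta

# 2) The "correction": a small residual term nudges each utility.
#    Skip connection: corrected = linear + f(linear), so the linear
#    part (and its interpretation) survives intact.
M = rng.normal(size=(n_alts, n_alts)) * 0.1
corrected_utility = linear_utility + np.tanh(linear_utility) @ M

# 3) Choice probabilities over the 9 squares.
def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

probs = softmax(corrected_utility)
```

In training, beta and M would be fitted jointly on observed trajectories; the point is that zeroing out M recovers a plain, fully interpretable logit model.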

The Results: Why the "Smart Tutor" Won

When they tested these two approaches:

  • The Rigid Architect (GEV): Made very few mistakes, but mostly just guessed the most popular moves. It didn't really understand the subtle differences between the 9 squares.
  • The Smart Tutor (ResLogit): Got much better at predicting the right move.
    • The "Safe Mistake" Analogy: If the computer guesses wrong, the "Rigid Architect" might guess you will run across the street when you actually just stepped back. That's a dangerous, big error.
    • The "Smart Tutor," however, usually misses by a single square when it guesses wrong. For example, if you actually stepped "Left," it guessed "Left-Slow." That is a tiny, harmless error. In the world of self-driving cars, knowing you will step roughly left is much safer than not knowing at all.
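This "safe mistake" idea can be made concrete with a distance-aware score on the grid. The tiny toy dataset below is invented for illustration; cells are indexed (row, col) with row encoding speed change and col encoding lateral step:

```python
# Hypothetical predicted vs. actual squares on the 3x3 grid.
actual    = [(1, 0), (1, 1), (2, 1), (1, 2)]
predicted = [(0, 0), (1, 1), (0, 1), (1, 2)]

def cell_distance(a, b):
    """Chebyshev distance: adjacent squares are 1 apart, opposite moves 2."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

# Plain accuracy treats every miss the same...
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# ...while mean grid distance separates a harmless neighbour miss
# (distance 1) from a dangerous opposite-move miss (distance 2).
mean_dist = sum(cell_distance(a, p) for a, p in zip(actual, predicted)) / len(actual)
print(accuracy, mean_dist)  # 0.5 0.75
```

Two models with identical accuracy can have very different mean distances, which is why the "neighbour errors" of ResLogit matter for safety.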

The Big Takeaway

The paper shows that for crowded, fast-moving situations where people make tiny, split-second adjustments, letting the computer learn the correlation patterns from data (ResLogit) works better than forcing the choices into strict, pre-defined groupings (GEV).

Crucially, the "Smart Tutor" still explains why it made its decision. It didn't become a "black box." It still says, "I predicted you would slow down because the car is close," but it adds a little extra "nudge" based on what it learned from real people.

In short: To teach a robot how humans walk near cars, don't just give it a rigid map. Give it a set of basic rules, and then let it learn from the messy, real-world experience of how people actually move.
