A Spatio-temporal Graph Network Allowing Incomplete Trajectory Input for Pedestrian Trajectory Prediction

The paper proposes STGN-IT, a spatio-temporal graph network that enables accurate pedestrian trajectory prediction even when historical input data is incomplete by encoding observation states and incorporating static obstacles as graph nodes.

Juncen Long, Gianluca Bardaro, Simone Mentasti, Matteo Matteucci

Published 2026-02-17

Imagine you are a robot trying to walk through a busy coffee shop. Your job is to predict where the customers will be in the next few seconds so you don't bump into them.

Most robot brains (algorithms) today have a very strict rule: "If I can't see you clearly for the entire time you've been in my view, I will ignore you completely."

This is a problem. In a real coffee shop, people get blocked by pillars, other people, or the robot's own body. If a customer steps behind a counter for a second, a standard robot stops tracking them. It thinks, "Oh, they vanished! I'll just pretend they don't exist." This is dangerous because that person might step right in front of the robot a moment later.

This paper introduces a new robot brain called STGN-IT that solves this problem. Here is how it works, using some simple analogies:

1. The "Ghost" Problem (Incomplete Trajectories)

Imagine you are playing a game of tag, but every time a player hides behind a tree, the referee erases them from the scoreboard. The other players stop looking for them.

  • Old Way: If a pedestrian is hidden (occluded), the robot deletes them from its memory.
  • STGN-IT's Way: The robot says, "I can't see you right now, but I know you were there a second ago. I will mark your spot as 'Ghost Mode' and keep guessing where you might go." It doesn't delete the person; it just marks them as "temporarily invisible."
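For readers who like code, here is a minimal Python sketch of the difference between the two approaches. The names (`drop_incomplete`, `keep_with_mask`) and the use of `None` for a hidden frame are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Optional, Tuple

# A frame is either an (x, y) position or None when the person was occluded.
# This representation is an assumption made for illustration.
Point = Optional[Tuple[float, float]]


def drop_incomplete(tracks: Dict[str, List[Point]]) -> Dict[str, List[Point]]:
    """Old way: keep only pedestrians that were visible in every frame."""
    return {
        pid: pts
        for pid, pts in tracks.items()
        if all(p is not None for p in pts)
    }


def keep_with_mask(
    tracks: Dict[str, List[Point]],
) -> Dict[str, Tuple[List[Point], List[bool]]]:
    """Keep everyone seen at least once, plus a per-frame visibility mask."""
    kept = {}
    for pid, pts in tracks.items():
        mask = [p is not None for p in pts]
        if any(mask):
            kept[pid] = (pts, mask)
    return kept


tracks = {
    "alice": [(0.0, 0.0), (0.5, 0.1), (1.0, 0.2)],
    "bob": [(2.0, 1.0), None, (2.4, 1.6)],  # hidden in the middle frame
}

print(sorted(drop_incomplete(tracks)))   # ['alice']
print(keep_with_mask(tracks)["bob"][1])  # [True, False, True]
```

With the old rule, Bob disappears from the input after a single hidden frame. With the mask, he stays in the scene and the model is told which of his frames are real observations.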

2. The "Smart Map" (Occupancy Grid)

Most robots use a fancy, hand-drawn map of the world. But in a real, messy environment, you need a map that updates itself instantly.

  • The Analogy: Think of a foggy window. If you wipe a spot on the glass, you see the world clearly. STGN-IT uses a "Point Cloud" (like a 3D laser scan) to automatically create a "Foggy Window Map" (Occupancy Grid), which is a grid of small cells, each marked as either blocked or free. It doesn't need a human to tell it where the walls are; it just sees the obstacles (walls, chairs, pillars) and adds them to its mental map instantly.
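As a rough illustration, turning a point cloud into an occupancy grid can be sketched in a few lines of Python. The cell size and the height band used to ignore floor and ceiling points are assumed values, and `occupancy_grid` is a hypothetical helper, not code from the paper.

```python
import math
from typing import Iterable, Set, Tuple


def occupancy_grid(
    points: Iterable[Tuple[float, float, float]],
    cell_size: float = 0.5,  # metres per cell (assumed)
    min_z: float = 0.2,      # ignore points near the floor (assumed)
    max_z: float = 2.0,      # ignore points above head height (assumed)
) -> Set[Tuple[int, int]]:
    """Return the set of (column, row) cells that contain an obstacle point."""
    occupied = set()
    for x, y, z in points:
        if min_z <= z <= max_z:
            cell = (math.floor(x / cell_size), math.floor(y / cell_size))
            occupied.add(cell)
    return occupied


cloud = [
    (1.2, 0.3, 1.0),    # part of a pillar
    (1.3, 0.4, 1.5),    # same pillar, same cell
    (-0.2, 2.1, 0.05),  # floor return, filtered out
    (3.9, 3.9, 0.8),    # a chair
]
print(sorted(occupancy_grid(cloud)))  # [(2, 0), (7, 7)]
```

A set of occupied cells is the simplest possible representation; a real system would more likely use a dense array and also filter out points that belong to the pedestrians themselves.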

3. The "Two-Step Dance" (The Prediction Process)

STGN-IT doesn't just guess once; it dances in two steps to get it right.

  • Step 1: The Wild Guess. The robot looks at where people are walking and guesses where they will go next, ignoring the walls for a moment.
  • Step 2: The Reality Check. The robot looks at its "Foggy Window Map." It sees, "Oh, I predicted this person would walk straight into a wall!" So, it adds the wall into its calculation as a "player" in the game. It re-runs the prediction, thinking, "Okay, since there's a wall here, the person will probably turn left instead."
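The reality check can be sketched as follows. This is a simplified illustration under stated assumptions: the first-pass predictions are given as plain lists of points, the obstacle cells come from a grid with 0.5 m cells, and the helper names are hypothetical. The actual second prediction pass (a graph network in the paper) is not reproduced here; the sketch only shows how blocked cells could become extra nodes.

```python
import math
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]
Path = List[Tuple[float, float]]


def colliding_cells(path: Path, occupied: Set[Cell], cell_size: float = 0.5) -> List[Cell]:
    """Occupied cells that a predicted path passes through, in path order."""
    hits: List[Cell] = []
    for x, y in path:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        if cell in occupied and cell not in hits:
            hits.append(cell)
    return hits


def build_second_pass_nodes(
    first_pass: Dict[str, Path], occupied: Set[Cell], cell_size: float = 0.5
) -> List[dict]:
    """Pedestrian nodes plus one obstacle node per cell hit by a first-pass path."""
    nodes = [{"kind": "pedestrian", "id": pid} for pid in first_pass]
    seen: Set[Cell] = set()
    for path in first_pass.values():
        for cell in colliding_cells(path, occupied, cell_size):
            if cell not in seen:
                seen.add(cell)
                # Place the obstacle node at the centre of its grid cell.
                center = ((cell[0] + 0.5) * cell_size, (cell[1] + 0.5) * cell_size)
                nodes.append({"kind": "obstacle", "pos": center})
    return nodes


occupied = {(4, 0)}  # one wall cell
first_pass = {
    "alice": [(1.0, 0.1), (1.6, 0.1), (2.2, 0.1)],  # last point lands in the wall
    "bob": [(0.0, 3.0), (0.0, 3.5)],                # clear path
}
nodes = build_second_pass_nodes(first_pass, occupied)
print([n["kind"] for n in nodes])  # ['pedestrian', 'pedestrian', 'obstacle']
print(nodes[-1]["pos"])            # (2.25, 0.25)
```

Only the obstacles that actually get in someone's way are added, which keeps the graph small even when the map contains many walls.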

4. The "Grouping" Trick (Clustering)

When there are 50 people in a room, it's hard to track who is interacting with whom.

  • The Analogy: Imagine a crowded dance floor. Instead of trying to track everyone individually, STGN-IT uses a "Clustering" algorithm to group people who are close together or moving together. It's like saying, "Okay, that group of three is moving as a unit," which makes it much easier for the robot's brain to understand the flow of traffic.
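To make the grouping idea concrete, here is a small Python sketch that groups people who stand within a fixed distance of each other, chaining neighbours together. This summary does not say which clustering algorithm the paper uses, so the distance rule, the 1.5 m radius, and the function name `cluster_by_distance` are all assumptions.

```python
import math
from typing import Dict, List, Tuple


def cluster_by_distance(
    positions: Dict[str, Tuple[float, float]], radius: float = 1.5
) -> List[List[str]]:
    """Group ids whose positions are linked by hops of at most `radius`."""
    ids = list(positions)
    parent = {pid: pid for pid in ids}

    def find(pid: str) -> str:
        # Follow parent links to the group's representative, shortening as we go.
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]
            pid = parent[pid]
        return pid

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(positions[a], positions[b]) <= radius:
                parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[str, List[str]] = {}
    for pid in ids:
        groups.setdefault(find(pid), []).append(pid)
    return sorted(sorted(group) for group in groups.values())


positions = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (2.0, 0.0),
    "d": (10.0, 10.0),
}
print(cluster_by_distance(positions))  # [['a', 'b', 'c'], ['d']]
```

Note that "a" and "c" are 2 m apart, farther than the radius, yet they end up in the same group because "b" stands between them. A version that also groups people "moving together" would compare velocities as well as positions.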

5. The "Code" for Hiding

How does the robot know the difference between a person who is actually standing at the center of the room (0,0) and a person who is just hidden behind a pillar?

  • The Solution: STGN-IT uses a special "ID Card" (Encoding). When a person is hidden, the robot doesn't just put their coordinates at zero; it attaches a special tag that says, "I am here, but I am currently invisible." This prevents the robot from getting confused and thinking the person magically teleported to the center of the room.
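A minimal sketch of such an "ID Card" in Python appends a visibility flag to every position. The single 0/1 flag and the zero placeholder are assumptions chosen for illustration; the paper's actual encoding of observation states may be richer.

```python
from typing import List, Optional, Tuple

Point = Optional[Tuple[float, float]]  # None marks a hidden frame (assumed convention)


def encode_track(track: List[Point]) -> List[Tuple[float, float, float]]:
    """Turn each frame into (x, y, observed), with observed = 1.0 or 0.0."""
    encoded = []
    for p in track:
        if p is None:
            # Placeholder coordinates, flagged as "not observed".
            encoded.append((0.0, 0.0, 0.0))
        else:
            encoded.append((p[0], p[1], 1.0))
    return encoded


standing_at_origin = [(0.0, 0.0), (0.0, 0.0)]
hidden_behind_pillar = [(3.0, 1.0), None]

print(encode_track(standing_at_origin))    # [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
print(encode_track(hidden_behind_pillar))  # [(3.0, 1.0, 1.0), (0.0, 0.0, 0.0)]
```

The third number is what separates the two cases: a real position at (0,0) carries a 1.0, while a hidden frame carries a 0.0, so the model never has to guess whether a zero is a measurement or a gap.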

Why Does This Matter?

The authors tested this on a dataset called STCrowd, which captures crowded scenes from a robot-like point of view (where things get blocked easily).

  • Old Robots: When people got blocked, the robots stopped predicting them, leading to potential collisions.
  • STGN-IT: Even when people were partially hidden, STGN-IT kept predicting their path smoothly. In the authors' experiments, it did better than the other robot brains they compared it against.

In short: STGN-IT is a robot brain that doesn't panic when it loses sight of someone. It keeps a mental note of "ghosts," checks the real-time map for walls, and uses a two-step thinking process to predict where people are likely to go, making it safer for robots to move among humans.
