Relational event models with global covariates

This paper proposes an innovative sampling approach using temporally shifted non-events to enable efficient estimation of global covariate effects within relational event models, demonstrating its effectiveness through simulations and a case study on Washington D.C. bike-sharing data that reveals significant impacts of weather and time of day on ride dynamics.

Melania Lembo, Rūta Juozaitienė, Veronica Vinciotti, Ernst C. Wit

Published 2026-03-10

Imagine you are trying to understand why people are riding bikes in Washington D.C. You have a massive list of every single bike trip taken in July 2023—about 350,000 trips. Each trip is a "story" connecting two bike stations at a specific time.

The authors of this paper are statisticians who want to figure out what drives these stories. They know that some things depend on the specific stations involved (like how far apart they are), but they also suspect that global factors—things that affect everyone at the same time, like the weather or the time of day—are huge drivers.

Here is the problem: Traditional statistical tools used for this kind of data are like a pair of glasses that only let you see the specific stations. They automatically "cancel out" the weather and time of day because, mathematically, those factors look the same for every single trip that could happen at that moment. It's like judging an orchestra only by comparing the instruments with one another: if the conductor turns the whole band up or down, every comparison stays the same, so you never notice the change in volume.
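The cancellation can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the scores and the size of the weather effect are made-up numbers, and partial_likelihood_prob is a hypothetical helper that computes the chance that one station pair is the one that produces the trip, among all pairs that could have produced it at that instant.

```python
import math

def partial_likelihood_prob(event_score, risk_set_scores, global_effect):
    """Chance that the observed pair fires, among all pairs at risk at the
    same instant. The global effect is added to every pair alike."""
    numerator = math.exp(event_score + global_effect)
    denominator = sum(math.exp(s + global_effect) for s in risk_set_scores)
    return numerator / denominator

# Hypothetical station-pair scores (e.g. driven by distance) at one instant.
scores = [0.4, -1.2, 0.1, -0.3]

# Same instant, with and without a (made-up) rain effect shared by all pairs.
p_dry = partial_likelihood_prob(scores[0], scores, global_effect=0.0)
p_rain = partial_likelihood_prob(scores[0], scores, global_effect=-2.0)

print(p_dry, p_rain)  # the two values agree: the rain effect has cancelled
```

Because the shared term multiplies the top and the bottom of the fraction by the same amount, the data carry no information about it in this kind of comparison.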

The Solution: The "Time-Shifted" Trick

To fix this, the authors invented a clever mathematical trick. Let's use an analogy:

The Analogy: The "Time-Traveling" Race

Imagine a race where runners (bike trips) start at different times.

  1. The Old Way: You look at all the runners starting at 9:00 AM. You ask, "Why did Runner A beat Runner B?" You look at their shoes and their training (station-specific data). But you ignore the fact that it was raining at 9:00 AM, because everyone was running in the rain. The rain cancels out in the math.
  2. The New Way (The Paper's Method): The authors say, "Let's mess with time." They take every single runner and give them a random, tiny "time shift."
    • Runner A (who actually started at 9:00 AM) is now analyzed as if they started at 9:05 AM.
    • Runner B (who actually started at 9:00 AM) is analyzed as if they started at 8:55 AM.

Why does this help?
Now, when the math looks at Runner A, it asks, "How did you do at 9:05 AM?" (Maybe it was sunny). When it looks at Runner B, it asks, "How did you do at 8:55 AM?" (Maybe it was still raining).

Because the runners are now being judged at different times, the weather (the global factor) no longer cancels out! The math can finally see that "Oh, people ride more when it's sunny at 9:05 AM than when it's raining at 8:55 AM."
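A toy version of this comparison, assuming one real trip is compared with one non-trip and the rain indicator is read at each one's own (shifted) time. The function name, scores, and effect sizes are illustrative assumptions, not the paper's estimator.

```python
import math

def shifted_pair_prob(event_score, control_score, gamma, x_event, x_control):
    """Logistic comparison of one event with one non-event.
    gamma is the effect of the global covariate; x_event and x_control are
    the covariate values at the (possibly different) times being compared."""
    diff = (event_score - control_score) + gamma * (x_event - x_control)
    return 1.0 / (1.0 + math.exp(-diff))

# Rain indicator: 1 = raining, 0 = dry. Try two candidate rain effects.
candidate_effects = (0.0, -2.0)

# Both judged at the same time (both in the rain): gamma has no influence.
same_time = [shifted_pair_prob(0.4, -1.2, g, x_event=1.0, x_control=1.0)
             for g in candidate_effects]

# Event judged at a dry moment, non-event at a rainy one: gamma now matters.
shifted = [shifted_pair_prob(0.4, -1.2, g, x_event=0.0, x_control=1.0)
           for g in candidate_effects]

print(same_time)
print(shifted)
```

In the first list the two values are identical, so the data cannot distinguish the candidate rain effects. In the second they differ, which is what lets the effect be estimated.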

The "Sampling" Shortcut

There's a catch. If you have 350,000 trips and millions of possible station pairs, checking every single possibility is like trying to count every grain of sand on a beach. It takes too long and crashes computers.

To solve this, the authors use a technique called Nested Case-Control Sampling.

  • The Analogy: Instead of interviewing every single person in a city to find out who bought a bike, you pick one person who bought a bike (the "Case") and then pick just one random person who didn't buy a bike at that exact moment (the "Control").
  • You compare these two. By repeating this process thousands of times, you can reliably estimate the trends for the whole city without interviewing millions of people. This makes the math fast enough to run on a normal laptop.
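The sampling step can be sketched as follows. This is a simplified illustration under stated assumptions: the station IDs and trips are invented, sample_case_control is a hypothetical helper, any start/end combination (including a round trip to the same station) is treated as a possible control, and the time shift discussed earlier is left out to keep the example short.

```python
import random

def sample_case_control(events, stations, rng):
    """For each observed trip (the case), draw one station pair that did not
    produce that trip (the control)."""
    pairs = []
    for time, start, end in events:
        # Redraw until the control differs from the observed pair.
        while True:
            control = (rng.choice(stations), rng.choice(stations))
            if control != (start, end):
                break
        pairs.append({"time": time, "case": (start, end), "control": control})
    return pairs

rng = random.Random(0)  # fixed seed so the draw is reproducible
stations = ["A", "B", "C", "D"]  # hypothetical station IDs
events = [(0.5, "A", "B"), (1.2, "C", "A"), (2.0, "B", "D")]  # (time, start, end)

sampled = sample_case_control(events, stations, rng)
for row in sampled:
    print(row)
```

The work per trip stays roughly constant no matter how many station pairs exist, which is why the approach scales to hundreds of thousands of trips.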

What Did They Find?

Using this new "Time-Traveling + Sampling" method on the Washington D.C. bike data, they discovered some very intuitive but previously hard-to-quantify truths:

  1. The Goldilocks Temperature: People love biking when it's warm, but if it gets too hot, they stop. It's not a straight line; it's a curve.
  2. Rain is a Dealbreaker: Even a little bit of rain stops people from riding.
  3. The Commuter Rhythm: There are huge spikes in riding at 9 AM (going to work) and 6 PM (coming home). The "rush hour" effect is massive.
  4. The "Competition" Surprise: They looked at whether having many bike stations close together helps or hurts. Surprisingly, having a station right next to another one didn't make people ride more. In fact, it seemed to slightly lower the activity, perhaps because the area was already saturated or the stations were too close to be useful for distinct trips.

The Big Picture

This paper is a toolkit for statisticians. It gives them a way to stop ignoring the "big picture" factors (like weather, holidays, or time of day) when studying how things connect over time.

By shifting the clock slightly and sampling smartly, they turned a problem the standard approach cannot solve into a tractable one. This gives city planners a clearer picture of how the weather and the clock influence whether people will hop on a bike or stay home, which can inform how bike networks are designed.