Estimating Treatment Effects under Algorithmic Interference: A Structured Neural Networks Approach

This paper proposes a structured semiparametric framework, combining an algorithm choice model with a viewer response model, to correct the severe bias that algorithmic interference in two-sided marketplaces induces in standard estimators, enabling accurate estimation of the global treatment effect of platform-wide algorithm updates.

Ruohan Zhan, Shichao Han, Yuchen Hu, Zhenling Jiang

Published Tue, 10 Ma

Imagine you are the manager of a massive, bustling digital town square (like TikTok or YouTube). In this square, thousands of street performers (creators) are trying to get the attention of passersby (viewers). To help the best performers get noticed, you have a "Traffic Director" (the algorithm) who decides who gets to stand on the main stage and who gets pushed to the back.

Every few months, you want to test a new Traffic Director to see if it does a better job than the old one. But here's the problem: You can't just watch the new director work in isolation.

The Problem: The "Crowded Stage" Effect

In the real world, when you test a new director, you can't give them their own private stage. You have to mix the new director's performers with the old director's performers on the same stage.

This creates a tricky situation called Algorithmic Interference.

Think of it like a game of musical chairs, but the chairs are "viewer attention."

  • If your new director is really good at spotting talent, they might push their performers to the front of the line.
  • Because there are only a few spots on the main stage, those new performers push out the old performers.
  • Suddenly, the old performers get fewer chances to perform, not because they are bad, but because the new director is too good at picking winners.

If you just count who got the most applause (likes/views) at the end of the day, you might get the wrong answer:

  1. The "Crowding Out" Bias: You might think the new director is amazing because their performers got more applause. But really, they just stole the spotlight from the old performers.
  2. The "Wrong Crowd" Bias: The new director might be great at finding performers who appeal to one slice of the audience, while the old director appealed to another. If you only compare total applause in the mixed crowd, you can misjudge how each director would do with the whole audience to themselves.

The Result: The standard way of measuring success (just comparing the two groups) is like judging a race where the runners are tripping over each other. You might declare a winner who actually lost, or a loser who actually won. In the real world, this could cost a company billions of dollars by rolling out a bad algorithm.

The Solution: The "Crystal Ball" Simulator

The authors of this paper (Ruohan Zhan, Shichao Han, et al.) built a smart simulator to fix this. Instead of just watching the race and guessing, they built a model that understands how the race works.

They used two main tools, like a detective solving a mystery:

  1. The "Choice Model" (The Traffic Director's Brain):
    They built a neural network (a type of AI) to figure out exactly how the algorithm decides who gets on stage. It learns: "When a viewer likes cats, and Creator A is a cat video, Creator A gets a high score. But if Creator B is also a cat video, the score changes."
    This model understands the competition. It knows that if Creator A gets a boost, Creator B might get pushed down.

  2. The "Response Model" (The Audience's Heart):
    They built another AI to predict how viewers react once they actually see a video. "If a viewer sees a cat video, how likely are they to laugh?"
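To make the two models concrete, here is a minimal Python/NumPy sketch. The softmax choice rule, the linear scoring functions, and the +0.5 "new director" boost are illustrative assumptions for exposition only; the paper's actual models are neural networks fit on logged platform data.

```python
import numpy as np

def choice_probabilities(scores):
    """Softmax over creator scores: the 'Traffic Director' turns raw
    scores into exposure shares. Because the shares sum to 1, boosting
    one creator necessarily shrinks everyone else's share."""
    z = scores - scores.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def algorithm_score(viewer, creator, new_director=False):
    """Choice model (toy stand-in): a linear viewer-creator match score,
    plus a made-up +0.5 boost under the new director."""
    return viewer @ creator + (0.5 if new_director else 0.0)

def viewer_response(viewer, creator):
    """Response model (toy stand-in): predicted probability the viewer
    engages (e.g. likes) once actually shown this creator's video."""
    return 1.0 / (1.0 + np.exp(-(viewer @ creator)))
```

The key property the choice model captures is the competition: since exposure shares must sum to 1, raising one creator's score automatically pushes every other creator's share down, which is exactly the crowding-out effect described above.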

The Magic Trick (Debiasing):
Once they have these two models, they don't just look at the real-world data. They run a simulation in their computer:

  • Scenario A: What if everyone used the new director? (The AI simulates the new director picking winners, pushing the old ones out, and predicts the total applause).
  • Scenario B: What if everyone used the old director? (The AI simulates the old director picking winners).

By comparing these two simulated worlds, they can see the true effect of the new algorithm, completely ignoring the messy interference of the mixed experiment.
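A minimal Python/NumPy sketch of this plug-in simulation, under the same illustrative assumptions as before (softmax exposure shares, toy score and response functions passed in as arguments). The paper's actual estimator adds debiasing corrections on top of the fitted models, which this sketch omits.

```python
import numpy as np

def simulate_world(viewers, creators, score_fn, response_fn, new_director):
    """Counterfactual average engagement if EVERY viewer were served by
    one director: for each viewer, the director allocates exposure over
    all creators (softmax over scores), and predicted responses are
    averaged under that exposure distribution."""
    per_viewer = []
    for v in viewers:
        scores = np.array([score_fn(v, c, new_director) for c in creators])
        p = np.exp(scores - scores.max())
        p /= p.sum()                                   # exposure shares
        r = np.array([response_fn(v, c) for c in creators])
        per_viewer.append(p @ r)                       # expected engagement
    return float(np.mean(per_viewer))

def global_treatment_effect(viewers, creators, score_fn, response_fn):
    """Scenario A (everyone gets the new director) minus
    Scenario B (everyone gets the old director)."""
    return (simulate_world(viewers, creators, score_fn, response_fn, True)
            - simulate_world(viewers, creators, score_fn, response_fn, False))
```

Note that neither simulated world mixes the two directors, so the comparison is free of the interference that contaminates the mixed experiment.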

They call this a "Debiased Estimator." It's like having a pair of glasses that filters out the noise of the crowd so you can see the true performance of the runners.

Why This Matters (The "Double-Sided" Cost)

Usually, to get the perfect answer, companies try to do a "Double-Sided Experiment." This is like building two separate, parallel universes:

  • Universe 1: Only the new director and their performers.
  • Universe 2: Only the old director and their performers.

This gives a perfect answer, but it's expensive and dangerous. You cut your audience in half, so the market feels "thin" (less variety), and you need twice the engineering work. It's like testing a new car engine by building two separate garages instead of just testing it on the road.

The authors' method is brilliant because it lets you test the new engine on the same road (the standard experiment) but uses their "Crystal Ball" math to tell you what would have happened if you had built two separate garages.

The Real-World Test

The team tested this on Weixin Channels (a huge short-video platform in China).

  • They ran a standard experiment mixing old and new algorithms.
  • They also ran the expensive "two universes" test just to see the truth.
  • The Result: The standard methods (just counting likes) said the new algorithm was great. The "Two Universes" test said the new algorithm was actually terrible (it hurt the platform).
  • The Winner: The authors' new "Debiased Estimator" correctly predicted that the new algorithm was terrible, matching the expensive "Two Universes" test.

The Takeaway

In a world where algorithms are constantly fighting for our attention, simple math isn't enough. You can't just compare Group A and Group B because they are influencing each other.

This paper gives companies a smart, mathematical telescope that lets them see the true impact of their changes without having to build expensive, parallel worlds. It saves money, prevents bad decisions, and ensures that the "Traffic Director" we choose actually helps everyone, not just a lucky few.