UrbanHuRo: A Two-Layer Human-Robot Collaboration Framework for the Joint Optimization of Heterogeneous Urban Services

This paper proposes UrbanHuRo, a two-layer human-robot collaboration framework that jointly optimizes heterogeneous urban services like crowdsourced delivery and sensing through scalable order dispatch and deep reinforcement learning, achieving significant improvements in sensing coverage, courier income, and order timeliness.

Tonmoy Dey, Lin Jiang, Zheng Dong, Guang Wang

Published 2026-03-05

Imagine a bustling city as a giant, chaotic dance floor. On one side, you have human couriers (like food delivery drivers) rushing around to drop off pizzas and burgers. On the other side, you have sensing robots (autonomous vehicles) driving around to collect data about traffic, air quality, and road conditions.

For a long time, these two groups danced to their own separate tunes. The humans focused only on getting food to people fast, while the robots focused only on mapping the city. They ignored each other, even though they were often driving down the same streets at the same time.

The Problem:
The city was inefficient.

  • The robots were driving empty-handed, missing chances to help deliver food.
  • The human couriers were already covering the streets on their delivery runs, yet missing chances to help the robots gather data.
  • It was like having a team of chefs who never wash dishes, and a team of dishwashers who never cook. Everyone is working hard, but the restaurant isn't running smoothly.

The Solution: UrbanHuRo
The authors of this paper created a new "dance instructor" called UrbanHuRo. Think of it as a smart, two-layer brain that coordinates the humans and robots so they can help each other without getting in the way.

Here is how it works, using simple analogies:

Layer 1: The "Smart Dispatcher" (The Matchmaker)

Imagine a busy restaurant kitchen. The manager (the Dispatcher) has to decide who takes which order.

  • Old Way: The manager just looks at who is closest to the customer.
  • UrbanHuRo Way: The manager looks at the whole picture. "Hey, Driver A is going to the park anyway. Let's give them a pizza order and tell them to take a quick air-quality reading on the way. Meanwhile, Robot B is free; let's send it to help with a rush order so the human driver doesn't get overwhelmed."
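To make the "whole picture" idea concrete, here is a minimal sketch of a dispatcher that scores candidates on more than distance. The names (`Candidate`, `dispatch_score`, `sensing_weight`) and all the numbers are made up for illustration; the paper's actual objective is not given in this summary.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str               # hypothetical worker id (human or robot)
    detour_km: float        # extra distance needed to take the order
    new_cells_sensed: int   # not-yet-sensed map cells along the resulting route

def dispatch_score(c, sensing_weight=0.5):
    """Higher is better: short detours and fresh sensing coverage both count."""
    return -c.detour_km + sensing_weight * c.new_cells_sensed

def pick_worker(candidates, sensing_weight=0.5):
    """Return the candidate with the best combined score."""
    return max(candidates, key=lambda c: dispatch_score(c, sensing_weight))

candidates = [
    Candidate("driver_a", detour_km=1.2, new_cells_sensed=4),
    Candidate("driver_b", detour_km=0.5, new_cells_sensed=0),
]

# "Old way": a sensing weight of 0 means only the detour distance matters.
nearest_only = pick_worker(candidates, sensing_weight=0.0)
# Joint view: sensing value can outweigh a slightly longer detour.
joint = pick_worker(candidates, sensing_weight=0.5)
print(nearest_only.name, joint.name)
```

With the sensing weight at zero the closest driver wins; once sensing counts, the driver who passes through unsensed areas is preferred.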

To do this mathematically without getting a computer headache, they used a MapReduce system. Think of this like a massive group project in school. Instead of one teacher trying to grade 1,000 papers alone, they split the papers among 50 students (computers), who grade their own piles quickly, and then the teacher combines the results. This allows the system to make thousands of decisions in the blink of an eye, even when the city is crazy busy.
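The split-then-combine pattern can be sketched in a few lines. This is a toy under stated assumptions: the zones, the order and worker ids, and the naive in-order pairing inside each zone are all hypothetical stand-ins, not the authors' dispatch algorithm.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: (order_id, zone) and (worker_id, zone).
orders = [("o1", "north"), ("o2", "north"), ("o3", "south")]
workers = [("w1", "north"), ("w2", "south"), ("w3", "north")]

def partition_by_zone(orders, workers):
    """Split the city-wide problem into independent per-zone piles."""
    zones = defaultdict(lambda: ([], []))
    for order_id, zone in orders:
        zones[zone][0].append(order_id)
    for worker_id, zone in workers:
        zones[zone][1].append(worker_id)
    return dict(zones)

def match_zone(item):
    """Map step: solve one zone on its own (here, naive in-order pairing)."""
    _zone, (zone_orders, zone_workers) = item
    return dict(zip(zone_orders, zone_workers))

def merge(partials):
    """Reduce step: combine the per-zone assignments into one plan."""
    plan = {}
    for partial in partials:
        plan.update(partial)
    return plan

zones = partition_by_zone(orders, workers)
with ThreadPoolExecutor() as pool:
    plan = merge(pool.map(match_zone, zones.items()))
print(plan)
```

Each zone is matched without looking at the others, which is what lets the work spread across many machines; the threads here only stand in for that cluster.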

Layer 2: The "Robot Navigator" (The GPS with a Brain)

Once the orders are assigned, the robots need to know where to go next.

  • The Human Couriers: They are free agents. They know the city best and will naturally take the fastest, most profitable routes. The system trusts them to do their thing.
  • The Robots: They need instructions. The system uses a deep reinforcement learning algorithm (like a video game AI that learns by playing thousands of times) to tell the robots where to drive.
    • If a robot is carrying food, it prioritizes getting the food there on time.
    • If a robot is empty, it drives to areas the city hasn't "seen" yet to gather fresh data.
    • Crucially, the robot learns to balance these two goals. It knows, "If I stop to take a photo of a pothole, I might be late with the pizza, so I'll skip the photo and deliver first."
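The "learn by playing" idea can be illustrated with a much smaller cousin of deep reinforcement learning: a lookup table of learned values instead of a neural network. Everything below is an assumption for illustration (the three states, the two actions, and the reward numbers), and each episode ends after a single action, so the update has no look-ahead term.

```python
# Hypothetical rewards: +1.0 for an on-time delivery, -1.0 for a late one,
# +0.3 for a sensing detour. "tight" = little time left, "slack" = plenty.
REWARDS = {
    ("loaded_tight", "deliver_now"): 1.0,
    ("loaded_tight", "detour_to_sense"): 0.3 - 1.0,  # data gained, pizza late
    ("loaded_slack", "deliver_now"): 1.0,
    ("loaded_slack", "detour_to_sense"): 0.3 + 1.0,  # data gained, still on time
    ("empty", "deliver_now"): 0.0,                   # nothing to deliver
    ("empty", "detour_to_sense"): 0.3,
}
STATES = ["loaded_tight", "loaded_slack", "empty"]
ACTIONS = ["deliver_now", "detour_to_sense"]

def train(episodes=200, alpha=0.1):
    """Try every state-action pair repeatedly and nudge its value toward the reward."""
    q = {key: 0.0 for key in REWARDS}
    for _ in range(episodes):
        for key, reward in REWARDS.items():
            q[key] += alpha * (reward - q[key])
    return q

def greedy_policy(q):
    """For each state, pick the action with the highest learned value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}

policy = greedy_policy(train())
print(policy)
```

The learned policy skips the detour only when the delivery is tight on time, which mirrors the pothole-versus-pizza trade-off described above.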

The Magic Trick: "Hybrid Rewards"

The hardest part of this dance is that the two goals (delivering food vs. gathering data) often conflict.

  • The Solution: The system uses a "hybrid score." It doesn't just ask, "Did we get the food there?" It also asks, "Did we get extra value from the trip?"
  • If a human driver delivers a pizza and happens to pass a smoggy area, the system gives them credit for both the pizza and the data.
  • If a robot helps deliver a pizza, it gets credit for the delivery and the fact that it cleared the way for more data collection later.
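One simple way to picture a hybrid score is a weighted sum of a delivery term and a sensing term. The function name, the weights, and the late penalty below are hypothetical; this summary does not give the paper's actual reward formula.

```python
def hybrid_reward(on_time, new_cells_sensed,
                  w_delivery=1.0, w_sensing=0.25, late_penalty=1.0):
    """Combine delivery and sensing credit into one score (illustrative weights)."""
    delivery_term = w_delivery if on_time else -late_penalty
    sensing_term = w_sensing * new_cells_sensed
    return delivery_term + sensing_term

pizza_only = hybrid_reward(on_time=True, new_cells_sensed=0)
pizza_plus_data = hybrid_reward(on_time=True, new_cells_sensed=4)
late_with_data = hybrid_reward(on_time=False, new_cells_sensed=4)
print(pizza_only, pizza_plus_data, late_with_data)
```

A trip that delivers on time and senses new ground scores higher than either alone, while a late delivery cancels out the sensing credit in this toy setting.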

The Results: A Win-Win Dance

When they tested this system using real data from Shanghai (with over 160,000 food orders), the results were amazing:

  1. Fewer Late Pizzas: The number of overdue orders dropped significantly. The robots helped the humans when things got busy, acting like a safety net.
  2. Better City Data: The system covered 29.7% more ground for sensing than previous methods. The robots and humans covered more territory together than they ever could apart.
  3. Happier Couriers: The human drivers earned 39.2% more money on average. Because fewer orders were late, they didn't get penalized, and the system assigned them orders that fit their trips more efficiently.

The Bottom Line

UrbanHuRo is like a conductor for a city orchestra. Instead of the drums (delivery) and the violins (sensing) playing out of sync, it gets them to play a harmonious duet. The humans get paid more and work under less stress, the robots get more done, and the city gets better air-quality and traffic data, all because everyone started helping each other out.