Sequential Service Region Design with Capacity-Constrained Investment and Spillover Effect

This paper proposes a novel solution framework that combines Real Options Analysis with a Transformer-based Proximal Policy Optimization (TPPO) algorithm to optimize sequential service region expansion under capacity constraints and stochastic spillover effects. The framework demonstrates superior convergence and option value compared to existing deep reinforcement learning methods.

Tingting Chen, Feng Chu, Jiantong Zhang

Published 2026-03-10

Imagine you are the CEO of a massive food delivery company. You want to expand your service to cover an entire country, but you have a limited budget and a small team. You can't open restaurants in every city tomorrow. You have to choose where to open first, when to open them, and how many to open at once.

This paper is about solving that exact puzzle, but with a high-tech twist. It tackles a problem called Sequential Service Region Design. Here is the breakdown in simple terms:

1. The Big Problem: The "Too Many Choices" Trap

Imagine you have 10 cities to conquer. You can open 2 or 3 at a time.

  • If you open City A first, does that make City B more popular? (Maybe people in City A start ordering from City B).
  • If you wait too long to open City C, do you lose customers to a competitor?
  • If you open too many at once, do you run out of money?

The number of possible ways to order these cities is astronomical. It's like trying to find the perfect path through a maze that has billions of branches. If you try to calculate every single path to find the "best" one, your computer would take years to finish.
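To get a feel for how fast this blows up, here is a toy Python count of the possible expansion plans. The "10 cities, 1 to 3 per year" setting is our illustration, not the paper's exact parameters:

```python
from functools import lru_cache
from math import comb

def count_plans(n_cities: int, k_max: int) -> int:
    """Count distinct expansion plans: ordered sequences of yearly
    batches, each batch opening between 1 and k_max cities,
    until all n_cities are open."""
    @lru_cache(maxsize=None)
    def ways(remaining: int) -> int:
        if remaining == 0:
            return 1
        total = 0
        for batch_size in range(1, min(k_max, remaining) + 1):
            # choose which cities go in this year's batch,
            # then plan the rest
            total += comb(remaining, batch_size) * ways(remaining - batch_size)
        return total
    return ways(n_cities)

print(count_plans(10, 3))  # → 96117000
```

Just 10 cities with at most 3 openings per year already gives nearly 100 million distinct plans, and that is before any demand uncertainty enters the picture.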

2. The Two Hidden Rules

The authors added two real-world rules that make the puzzle harder but more realistic:

  • The "K-Region" Limit: You can't just pick one city at a time. You have a rule that says, "You can open at most K cities in a single year." This changes the game from picking a single city to picking a team (or portfolio) of cities every year.
  • The "Spillover" Effect: This is the magic ingredient. When you open a restaurant in one city, it doesn't just help that city. It creates a "ripple effect." Maybe people in a neighboring city start ordering more because they see your brand is growing nearby. The paper treats this ripple effect as a random, exciting surprise that changes the future demand.
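A minimal sketch of what a stochastic spillover update could look like. The regions, lift values, and noise model here are purely illustrative, not the paper's actual demand dynamics:

```python
import random

def update_demand(demand, open_regions, spillover, noise=0.1):
    """One-step demand update: each newly opened region lifts its
    neighbors' demand by a random spillover amount.
    spillover[i][j] is the expected lift region i gives region j;
    all numbers here are illustrative, not from the paper."""
    new_demand = dict(demand)
    for i in open_regions:
        for j, lift in spillover.get(i, {}).items():
            # stochastic spillover: expected lift plus a random shock
            new_demand[j] += lift * (1 + random.uniform(-noise, noise))
    return new_demand

demand = {"A": 100.0, "B": 80.0, "C": 60.0}
spillover = {"A": {"B": 12.0, "C": 5.0}}  # opening A boosts B and C
print(update_demand(demand, open_regions={"A"}, spillover=spillover))
```

The key point: opening region A changes the future demand of B and C by a random amount, so the value of opening B or C next year is uncertain until you actually open A and observe the ripple.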

3. The Solution: A "Time-Traveling" AI Coach

To solve this, the authors built a smart system that combines two powerful ideas:

A. Real Options Analysis (The "Time-Traveler")
In finance, a "Real Option" is like having a coupon that lets you buy something later if the price is right.

  • Imagine you have a coupon to open a restaurant in 2026. You don't have to use it today. You wait and see if the city becomes popular.
  • The paper uses a math method called LSMC (Least Squares Monte Carlo) to calculate the value of waiting. It asks: "Is it worth opening now, or should I wait to see if the 'ripple effect' makes the city more valuable later?"
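Here is a toy version of the LSMC idea for a single region: simulate many futures, regress the discounted future payoff on today's state, and open now only where that beats the estimated value of waiting. The state process, cost, and discount rate are stand-in numbers, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 10,000 paths of a region's "attractiveness" x over one step.
# x could stand for demand inflated by spillover; numbers illustrative.
n_paths = 10_000
x_now = rng.uniform(0.5, 1.5, n_paths)             # today's state
x_next = x_now * rng.lognormal(0.0, 0.2, n_paths)  # next year's state

cost, discount = 1.0, 0.95
payoff_now = np.maximum(x_now - cost, 0.0)               # open today
payoff_next = discount * np.maximum(x_next - cost, 0.0)  # open next year

# Least Squares Monte Carlo: regress the discounted future payoff on a
# polynomial in today's state to estimate the value of waiting.
coeffs = np.polyfit(x_now, payoff_next, deg=2)
continuation = np.polyval(coeffs, x_now)

# Open now only where today's payoff beats the estimated value of waiting.
open_now = payoff_now > continuation
option_value = np.where(open_now, payoff_now, payoff_next).mean()
print(f"opened now: {open_now.mean():.2f}, value: {option_value:.3f}")
```

The regression step is what makes LSMC tractable: instead of exploring every future, it learns a cheap formula for "how much is waiting worth, given where I am today."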

B. The AI Coach (TPPO)
Since there are too many paths to check, they trained an AI using Deep Reinforcement Learning. Think of this AI as a chess grandmaster who learns by playing millions of games.

  • The Secret Sauce: They didn't just use a standard AI. They used a Transformer (the same technology behind tools like ChatGPT).
  • Why a Transformer? A standard AI might look at cities one by one. A Transformer looks at the whole map at once, understanding how City A relates to City B, City C, and the whole network. It understands the "ripple effects" much better.
  • The Training: The AI plays the game over and over. Every time it picks a group of cities, the "Time-Traveler" math (Real Options) tells it: "Good job! That sequence gave you a high value because you waited for the right moment." The AI learns to repeat those good moves.
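The "look at the whole map at once" ability comes from self-attention. Here is a single attention head over a handful of cities, in plain NumPy; the feature vectors and weights are random placeholders, not the paper's trained network:

```python
import numpy as np

rng = np.random.default_rng(1)

# 5 candidate cities, each described by a small feature vector
# (e.g. demand, opening cost, distance to open regions); illustrative.
n_cities, d = 5, 4
features = rng.normal(size=(n_cities, d))

# Single-head self-attention: every city looks at every other city,
# so the representation of city i reflects the whole network at once.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = features @ Wq, features @ Wk, features @ Wv

scores = Q @ K.T / np.sqrt(d)                  # pairwise city affinities
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)  # softmax over cities
context = weights @ V                          # each row mixes all cities

print(weights.round(2))  # row i: how much city i attends to each city
```

Each row of `weights` sums to 1 and spreads attention across every city, which is exactly why a Transformer can capture "opening A helps B" relationships that a city-by-city model would miss.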

4. What Did They Discover? (The "Aha!" Moments)

After running thousands of simulations on real data from Shanghai, Beijing, and New York, they found some surprising things:

  • Don't Rush the Big Fish: It seems logical to open in the biggest, busiest cities first. But the AI found the opposite! It's often better to start in smaller, quieter cities. Why? Because the "big" cities are so valuable that you want to keep the option to open them later, when you know more. Opening them too early locks you in.
  • The "Goldilocks" Speed: You shouldn't open too few cities (too slow) or too many (too risky). There is a "sweet spot" (usually opening 4 or 5 regions at a time) where you get the most value.
  • Teamwork Matters: The AI learned that certain cities should be opened together because they boost each other's demand. It's like opening a gym and a smoothie shop next to each other; they work better as a pair.
  • The AI Wins: When compared to simple strategies (like "always pick the biggest city" or "always pick the cheapest"), the AI found solutions that were 30% to 50% more profitable.

The Bottom Line

This paper teaches us that expanding a service network isn't just about picking the best locations. It's about timing and flexibility.

By using a smart AI that understands how cities influence each other (the spillover) and values the ability to wait (real options), companies can grow faster, spend less, and make more money than if they just guessed or followed old rules. It's the difference between blindly running a race and having a coach who knows exactly when to sprint and when to hold back.