A stochastic optimization algorithm for revenue maximization in a service system with balking customers

This paper proposes a stochastic gradient descent algorithm that dynamically maximizes revenue in a single-server queue with balking customers. A novel infinitesimal perturbation analysis (IPA) procedure estimates the effective arrival rate from observable joining behavior alone, and the resulting sequence of prices converges to the optimal price under mild regularity conditions.

Shreehari Anand Bodas, Harsha Honnappa, Michel Mandjes, Liron Ravner

Published 2026-03-05

Imagine you run a very popular, single-lane coffee shop. You have one barista (the server), and customers arrive randomly. Your goal is simple: make the most money possible every hour.

To do this, you need to find the perfect price for your coffee.

  • If the price is too low, everyone rushes in. The line gets huge, people wait forever, and they get annoyed.
  • If the price is too high, people walk away immediately because it's too expensive.

The tricky part is that you can't predict how long the line will get or how long people will wait; you only find out by watching. Worse, you can't ask the people who left (the ones who balked) why they left; you only see the people who actually bought a coffee.

This paper presents a smart, self-learning computer algorithm that acts as your "Price Manager." It figures out the perfect price on the fly, even though it can't see the people who walked away.

Here is how the paper breaks it down, using simple analogies:

1. The Problem: The "Invisible Crowd"

In many business models, you assume you know exactly how many people will show up at a certain price. But in a real queue (like a coffee shop or a server farm), it's a feedback loop:

  • High Price → Fewer people join.
  • Low Price → More people join → The line gets longer → People get annoyed and leave (this is called balking).

The authors' challenge is that you only see the "effective" arrivals (the people who actually joined). You don't see the "ghosts"—the people who saw the line, got scared, and left. Yet, you need to know about those ghosts to set the right price.

2. The Solution: A "Smart Guessing" Algorithm

The authors created a Stochastic Gradient Descent (SGD) algorithm. Think of this as a hiker trying to find the highest peak in a foggy mountain range (the peak is the maximum revenue).

  • The hiker can't see the whole mountain.
  • They can only feel the slope under their feet at their current spot.
  • They take a step in the direction that feels "uphill."
  • They repeat this, adjusting their steps, until they reach the top.

In this case, the "hiker" is the price. The "slope" is how much money you are making right now compared to the last price you tried. The algorithm nudges the price up or down to find the sweet spot.
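The hiker loop can be sketched in a few lines. This is a toy version with a made-up demand curve and an exact slope oracle; in the paper the slope must be estimated from noisy queue observations, but the update rule has the same shape.

```python
# Toy gradient-ascent "hiker" on an assumed revenue curve
# R(p) = p * demand(p). The demand function and the exact-slope oracle
# are illustrative stand-ins for what the paper estimates from data.

def demand(price):
    """Toy demand: fewer customers join as the price rises."""
    return max(0.0, 10.0 - 2.0 * price)

def revenue(price):
    return price * demand(price)

def slope(price, eps=1e-4):
    """Finite-difference stand-in for 'feeling the ground underfoot'."""
    return (revenue(price + eps) - revenue(price - eps)) / (2 * eps)

price = 0.5                               # arbitrary starting price
for n in range(1, 201):
    price += (0.2 / n) * slope(price)     # shrinking steps, as in stochastic approximation

print(round(price, 2))                    # approaches 2.5, the argmax of p*(10 - 2p)
```

The shrinking step sizes are the classic stochastic-approximation trick: big steps early to move fast, small steps later so noise can't knock the hiker off the peak.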

3. The Secret Sauce: "Infinitesimal Perturbation Analysis" (IPA)

This is the most technical part, but here is the simple version:
Usually, to know how to change your price, you need to know how the entire system reacts. But since you can't see the people who left, you can't calculate the slope directly.

The authors invented a new way to estimate the slope using only the people who did join.

  • The Analogy: Imagine you are watching a river. You can't see the rain falling upstream (the people who left), but you can see the water level rising and falling.
  • The authors developed a mathematical trick (IPA) that looks at the tiny ripples in the water (the behavior of the people who joined) and uses them to mathematically reconstruct what the rain (the balking customers) must have been doing.
  • This allows the algorithm to "guess" the slope of the revenue curve accurately, even with missing data.
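To make the "river" idea concrete, here is a deliberately simple toy model, an assumption for illustration rather than the paper's actual estimator: arrivals form a Poisson stream with a hidden rate, each customer has an Exp(1) willingness-to-pay, and joins only if it exceeds the price p. Then the joining rate is λ·e^(−p), and the revenue-rate slope (1 − p)·λ·e^(−p) can be written entirely in terms of the observable joining rate, so the invisible balkers are never needed.

```python
# Toy illustration (assumed model, not the paper's IPA estimator):
# arrivals form a Poisson stream with hidden rate lam; each customer has
# an Exp(1) willingness-to-pay and joins only if it exceeds the price p.
# Then lam_eff(p) = lam * exp(-p), and the revenue-rate slope
#     d/dp [p * lam_eff(p)] = (1 - p) * lam_eff(p)
# depends only on the observable joining rate -- no ghosts required.
import math
import random

random.seed(0)
lam, p, T = 5.0, 0.8, 10_000.0       # hidden arrival rate, price, window length

t, joins = 0.0, 0
while True:
    t += random.expovariate(lam)     # next (possibly invisible) arrival
    if t > T:
        break
    if random.expovariate(1.0) > p:  # willingness-to-pay beats the price: joins
        joins += 1

lam_eff_hat = joins / T                        # estimated from joiners only
grad_hat = (1.0 - p) * lam_eff_hat             # slope estimate, balkers unseen
grad_true = (1.0 - p) * lam * math.exp(-p)     # ground truth, for comparison
print(round(grad_hat, 3), round(grad_true, 3))
```

The paper's IPA estimator tackles the much harder queueing version, where balking depends on waiting times, but the spirit is the same: rewrite the slope so that only observable quantities appear in it.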

4. The "Window" Strategy

The algorithm doesn't change the price every second. It works in cycles or windows.

  • Set a price.
  • Wait for a while (a "window") to collect data on how many people joined and how long they waited.
  • Calculate the gradient (did we make more or less money?).
  • Adjust the price for the next window.

The paper spends a lot of time figuring out the perfect size for these windows.

  • Too small: You don't have enough data to know if the price change helped or hurt. The algorithm gets jittery and confused.
  • Too big: You waste time testing a bad price for too long before correcting it.
  • Just right: The algorithm learns fast and settles on the perfect price.
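The cycle above can be sketched end to end on a toy model (an assumption for illustration: hidden Poisson arrivals at rate λ, Exp(1) willingness-to-pay, so customers join at rate λ·e^(−price) and the revenue slope is (1 − price) times the joining rate). The window lengths and step sizes here are arbitrary choices, not the paper's tuned schedule.

```python
# Windowed price updates on a toy model (illustrative assumptions: hidden
# Poisson arrivals at rate lam, Exp(1) willingness-to-pay, so the joining
# rate is lam * exp(-price)). Window and step schedules are made up.
import random

random.seed(1)
lam = 5.0                                   # hidden total arrival rate

def joiners_in_window(price, horizon):
    """Simulate one window; return how many customers joined."""
    t, joins = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > horizon:
            return joins
        if random.expovariate(1.0) > price:
            joins += 1

price = 0.3                                 # arbitrary starting price
for k in range(1, 101):
    horizon = 50.0 * k                      # growing windows: more data per cycle
    lam_eff_hat = joiners_in_window(price, horizon) / horizon
    grad_hat = (1.0 - price) * lam_eff_hat  # slope from joiners only (toy model)
    price = max(0.05, price + (0.5 / k) * grad_hat)

print(round(price, 2))                      # the toy model's optimal price is 1.0
```

Growing the windows while shrinking the steps captures the trade-off in the list above: early cycles explore quickly with little data, later cycles collect enough observations that the slope estimate is trustworthy.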

5. What Did They Prove?

The authors didn't just build a cool tool; they proved mathematically that it works.

  • Convergence: They showed that if you let the algorithm run long enough, it will always find the best possible price, no matter where it starts.
  • Regret: In the beginning, the algorithm might pick a bad price and lose some money. They bounded how much money is "lost" while learning and proved that this loss grows very slowly compared to the total money you make.
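A back-of-the-envelope way to see the regret claim, again on a toy model (assumed for illustration; the paper proves bounds for the actual queueing system): run windowed price updates, and in each window add the revenue gap between the current price and the best price to a running total. That total stays a tiny fraction of what the best price would have earned.

```python
# Toy regret tally (illustrative model: joining rate lam * exp(-price),
# so expected revenue per unit time is price * lam * exp(-price), which is
# maximized at price 1.0; the noisy slope estimate uses a normal
# approximation to the joiner count). Not the paper's bound -- just the flavor.
import math
import random

random.seed(2)
lam = 5.0

def rev_rate(price):                    # expected revenue per unit time
    return price * lam * math.exp(-price)

p_star, price = 1.0, 0.3
regret = best = 0.0
for k in range(1, 101):
    horizon = 50.0 * k
    regret += horizon * (rev_rate(p_star) - rev_rate(price))  # money left on the table
    best += horizon * rev_rate(p_star)                        # benchmark earnings
    mean = lam * math.exp(-price)                             # joining rate at this price
    lam_eff_hat = random.gauss(mean, math.sqrt(mean / horizon))
    grad_hat = (1.0 - price) * lam_eff_hat
    price = max(0.05, price + (0.5 / k) * grad_hat)

print(round(100 * regret / best, 2))    # lost revenue as a % of the benchmark
```

Almost all of the loss is incurred in the first few windows, while the price is still far from optimal; once the algorithm settles near the best price, each additional window adds almost nothing to the regret.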

Summary

This paper solves a complex puzzle: How do you set the perfect price for a service when customers might leave if the line is too long, and you can't see the people who left?

The answer is a self-correcting algorithm that uses a clever mathematical trick to infer the invisible customers' behavior from the visible ones. It learns by trial and error, tuning both its step size (how far it nudges the price) and its observation windows (how long it waits between price changes) so that it learns quickly without making too many mistakes.

In a nutshell: It's a robot barista that learns the perfect coffee price by watching who buys and who walks away, eventually figuring out the exact price that fills the shop without scaring anyone off.