Beyond Local Code Optimization: Multi-Agent Reasoning for Software System Optimization

This paper introduces a multi-agent framework that moves beyond local code optimization to enable whole-system performance reasoning for microservices, achieving significant throughput and response time improvements by coordinating specialized agents to analyze architectural dependencies and construct cross-component optimization strategies.

Huiyun Peng, Parth Vinod Patil, Antonio Zhong Qiu, George K. Thiruvathukal, James C. Davis

Published 2026-03-17

Imagine you are the manager of a massive, bustling restaurant. Your goal is to serve more customers faster without making the food taste bad or breaking the kitchen.

The Old Way: The "Single Chef" Approach

For a long time, AI tools trying to speed up software were like single chefs looking at one specific recipe card at a time.

  • If a chef saw a step that took too long (like chopping onions), they would suggest a faster knife.
  • The Problem: This chef doesn't know that the next station is waiting for those onions, or that the oven is already full, or that the delivery driver is stuck in traffic. They only see the one task in front of them. They might speed up chopping, but if the oven is the bottleneck, the whole restaurant is still slow.

This is what most current AI optimization tools do: they fix small pieces of code (like a single function) but miss the big picture of how different parts of the software talk to each other.

The New Way: The "Conductor's Orchestra"

This paper introduces a multi-agent reasoning framework. Instead of one chef working alone, imagine a team of specialized experts who coordinate, like an orchestra following a conductor, to optimize the entire restaurant at once.

Here is how their team works:

1. The Map Makers (Summarization Agents)

Before fixing anything, this team creates a giant, detailed map of the restaurant.

  • The Architect: Draws the layout of the kitchen, the dining room, and the delivery routes (System Structure).
  • The Flow Tracker: Watches how orders move from the host to the kitchen to the table, noting where people bump into each other (Control Flow).
  • The Context Giver: Checks the weather, the number of staff on shift, and the type of stove being used (Environment).
  • Why it matters: They don't just look at a recipe; they understand how the whole building operates.

2. The Detectives (Analysis Agents)

Using the maps, these agents act like detectives looking for the real traffic jams.

  • They might say, "Hey, the problem isn't that the chefs are slow; it's that the delivery driver is waiting 5 minutes for a table to clear up because the host isn't calling them fast enough."
  • They find bottlenecks that happen between different parts of the system, not just inside one part.
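
To make the detective work concrete, here is a minimal sketch of one way to spot a bottleneck that sits between components. It is not the paper's method. The `Span` record, the service names, and the timings are all hypothetical, and the sketch assumes one span per service with calls made one after another.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One service's share of a single request (hypothetical trace record)."""
    service: str
    start_ms: float
    end_ms: float
    parent: Optional[str] = None  # calling service; None for the entry point


def self_times(spans):
    """Time each service spends on its own work, excluding waits on callees."""
    total = {s.service: s.end_ms - s.start_ms for s in spans}
    own = dict(total)
    for s in spans:
        if s.parent is not None:
            own[s.parent] -= total[s.service]
    return own


def bottleneck(spans):
    """Service with the largest self time on this request."""
    own = self_times(spans)
    return max(own, key=own.get)


# Made-up trace: the frontend looks slow (100 ms), but most of that is waiting.
trace = [
    Span("frontend", 0.0, 100.0),
    Span("auth", 5.0, 15.0, parent="frontend"),
    Span("database", 20.0, 90.0, parent="frontend"),
]
print(self_times(trace))
print(bottleneck(trace))
```

A tool that only looked at the frontend's code would see a 100 ms handler and try to speed it up. Subtracting the time spent waiting on other services shows that the real delay lives elsewhere.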

3. The Fixers (Optimization Agents)

Once the problem is found, the fixers propose changes.

  • Instead of just telling the chef to chop faster, they might say, "Let's change the layout so the delivery driver can walk straight to the table," or "Let's have the host call the driver before the table is fully cleared."
  • Crucially, they promise not to break anything. They won't change the menu (the public interface) or serve bad food (break the code). They only tweak the internal process.
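
As an illustration of "tweak the internal process, keep the menu", the sketch below replaces many small calls to another service with one batched call while keeping the caller-facing function signature and result the same. This is an assumed example of a cross-component change, not one taken from the paper. `FakePriceService`, `fetch_one`, `fetch_many`, and the prices are hypothetical.

```python
class FakePriceService:
    """Stand-in for a remote service; counts round trips instead of doing I/O."""

    def __init__(self, prices_in_cents):
        self._prices = prices_in_cents
        self.round_trips = 0

    def fetch_one(self, product_id):
        self.round_trips += 1
        return self._prices[product_id]

    def fetch_many(self, product_ids):
        self.round_trips += 1
        return {pid: self._prices[pid] for pid in set(product_ids)}


def cart_total_before(service, product_ids):
    # One round trip per item in the cart.
    return sum(service.fetch_one(pid) for pid in product_ids)


def cart_total_after(service, product_ids):
    # Same signature and same result, but a single round trip.
    prices = service.fetch_many(product_ids)
    return sum(prices[pid] for pid in product_ids)


catalog = {"green": 450, "black": 300, "oolong": 725}
cart = ["green", "black", "oolong", "green"]

slow = FakePriceService(catalog)
fast = FakePriceService(catalog)
print(cart_total_before(slow, cart), slow.round_trips)
print(cart_total_after(fast, cart), fast.round_trips)
```

Callers cannot tell the two versions apart by their results, which is the sense in which the public interface stays fixed. Only the number of trips between components changes.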

4. The Inspectors (Verification Agents)

Before the changes go live, the inspectors run a test.

  • They simulate a rush hour to see if the new layout actually works.
  • They check: "Did we serve more people? Did the food still taste good? Did anyone get hurt?"
  • If it works, the changes are applied. If not, they go back to the drawing board.
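
The inspectors' checklist can be sketched as a simple accept-or-reject gate. The acceptance rule here (tests pass, throughput rises by a minimum margin, average latency does not get worse) is an assumption for illustration. The paper's actual criteria may differ, and the `Measurement` fields and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Result of one load-test run (hypothetical fields)."""
    tests_passed: bool
    throughput_rps: float
    avg_latency_ms: float


def accept_change(baseline, candidate, min_gain=0.05):
    """Keep a change only if it is correct and measurably faster."""
    if not candidate.tests_passed:
        return False  # "Did the food still taste good?"
    faster = candidate.throughput_rps > baseline.throughput_rps * (1 + min_gain)
    no_regression = candidate.avg_latency_ms <= baseline.avg_latency_ms
    return faster and no_regression


baseline = Measurement(True, 100.0, 50.0)
print(accept_change(baseline, Measurement(True, 120.0, 40.0)))   # faster and correct
print(accept_change(baseline, Measurement(False, 150.0, 30.0)))  # fast but broken
print(accept_change(baseline, Measurement(True, 102.0, 49.0)))   # gain too small
```

Putting correctness first means a change that breaks behavior is rejected no matter how fast it runs.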

The Results: A Faster Restaurant

The authors tested this "team of experts" on TeaStore, a microservice benchmark application that emulates an online shop.

  • Before: The system could handle about 1,200 requests per second.
  • After: With the AI team's help, it handled 1,635 requests per second.
  • Response time: The average time to get a response dropped by nearly 28%.
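
To put the figures above in relative terms, here is a short calculation. It uses only the numbers quoted in this summary. Because the starting throughput is given as "about 1,200", the result is approximate.

```python
def relative_change(before, after):
    """Fractional change from before to after (0.25 means 25% higher)."""
    return (after - before) / before


# Throughput: "about 1,200" before, 1,635 after.
throughput_gain = relative_change(1200, 1635)
print(f"throughput gain: about {round(throughput_gain * 100)}%")

# A drop of nearly 28% in average response time leaves about 72% of the
# original time per response.
remaining_fraction = 1 - 0.28
print(f"each response takes about {round(remaining_fraction * 100)}% as long")
```

In other words, the system served roughly a third more requests per second while each request came back in a little under three quarters of the original time.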

Why This Matters

Think of it like upgrading from a bicycle to a high-speed train.

  • The old AI tools were like fixing a loose bolt on a bicycle wheel. It helps a little, but you're still limited by the bike.
  • This new system looks at the tracks, the engine, the schedule, and the passengers to build a train. It understands that the speed of the whole system depends on how all the parts work together, not just how fast one gear turns.

In short: This paper shows that a team of AI agents that talk to each other and look at the "big picture" of software architecture can find speedups that function-by-function fixes miss, making a complex system like TeaStore significantly faster.
