WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows

This paper introduces WORKSWORLD, a new domain for numeric planners that automates the joint planning and scheduling of distributed data pipelines by allowing users to define high-level goals without specifying the entire workflow graph, demonstrating the ability to solve complex multi-site problems on commodity hardware.

Taylor Paul, William Regli

Published 2026-03-13

Imagine you are the Grand Conductor of a massive, global orchestra.

Your orchestra isn't made of musicians, but of computers, servers, and data centers scattered all over the world (some in big cloud data centers, some in smaller "fog" nodes closer to the users, and some right at the network edge). Switching metaphors to the kitchen: your job is to take raw ingredients (data) from various farms, cook them into delicious meals (processed information), and serve them to hungry customers (applications) in specific formats, all while keeping the bill low and the wait time short.

The paper, "WORKSWORLD," is about building an automated Conductor that can work out a good recipe for this orchestra, even when the kitchen is huge and the orders are complicated.

Here is the breakdown of the problem and the solution, using simple analogies:

The Problem: The "Too Many Choices" Dilemma

In the real world, companies spend billions on data, but often fail to get value because they can't figure out how to move data efficiently.

  • The Old Way: Usually, a human engineer has to draw a map of exactly how every step should happen (e.g., "Take data from Site A, send it to Site B, compress it, then send to Site C"). This is like asking a chef to write down every single step of a recipe before they even know what ingredients are available. It's rigid, hard to change, and often leads to traffic jams (latency) or huge bills (cost).
  • The Challenge: You have to decide:
    1. Where to cook the data (Cloud? Edge? Fog?).
    2. How to move it (Direct highway? A two-hop detour?).
    3. When to do it (Does it need to be instant, or can it wait?).
    4. What format it needs to be in.

Working through these choices by hand for a complex network is like trying to solve a Rubik's Cube while blindfolded. The number of possible combinations grows far too quickly for a human to check them all.
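To get a feel for how fast the choices pile up, here is a rough back-of-the-envelope sketch. It assumes every pipeline step can be placed on any site, and it ignores routing, timing, and format choices, so it undercounts the real problem. The function name is made up for illustration; the 14-step, 8-site size is the largest case mentioned later in this summary.

```python
def placement_count(num_steps: int, num_sites: int) -> int:
    """Count the ways to assign each step to one site.

    Simplifying assumption: any step may run on any site, and routing,
    timing, and data formats are ignored.
    """
    return num_sites ** num_steps


# A toy pipeline versus the largest case mentioned in this summary.
print(placement_count(3, 3))    # 27
print(placement_count(14, 8))   # 4398046511104
```

Even this simplified count passes four trillion for 14 steps on 8 sites, which is why the search is handed to a planner.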

The Solution: WORKSWORLD (The "Magic Recipe Book")

The authors created a new system called WORKSWORLD. Think of it as a universal translator and a master chef rolled into one.

  1. The Input (The YAML Config):
    Instead of drawing a complex map, the human engineer just writes a simple shopping list in a file written in YAML, a human-readable configuration format.

    • Analogy: You tell the AI: "I have apples at the farm (Source). I need apple pie at the bakery (Destination). I have ovens in three different cities. I want it done in under 10 minutes and under $50."
    • You don't tell the AI how to do it. You just tell it what you want.
  2. The Translator (YAML to PDDL):
    The system takes your simple shopping list and translates it into a strict, mathematical language called PDDL (Planning Domain Definition Language). This is the language the AI Conductor speaks.

  3. The Brain (The Numeric Planner):
    The AI (specifically a numeric planner called ENHSP) looks at the "shopping list" and the "map of the world" (the resources). It searches through the possible sequences of actions to find the best path it can.

    • It asks: "Should I bake the pie at the farm and ship it? Or ship the apples to the city and bake there?"
    • It calculates: "If I bake at the farm, the shipping cost is low, but the oven is slow. If I bake in the city, the oven is fast, but shipping the apples costs more."
    • It looks for a plan that balances cost and speed within the limits you set.
  4. The Output (The Plan):
    The AI produces a step-by-step instruction manual: "Go to Farm A, pick up apples. Send them to City B. Bake them. Send the pie to the Bakery." It builds the "pipeline" automatically.
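The translation step (shopping list in, PDDL out) can be sketched roughly as follows. This is a minimal illustration, not the paper's translator: the dictionary stands in for a loaded YAML file, and every key, object name, predicate, and fluent (`linked`, `at`, `has-format`, `total-cost`, `elapsed-minutes`, the domain name `worksworld-sketch`) is a hypothetical placeholder invented for this example.

```python
# Hypothetical high-level spec, as it might look after loading a YAML file.
# All keys and names are illustrative assumptions, not the paper's schema.
spec = {
    "sites": ["farm", "city", "bakery"],
    "links": [("farm", "city"), ("city", "bakery")],
    "source": {"data": "apples", "site": "farm", "format": "raw"},
    "goal": {"data": "apples", "site": "bakery", "format": "pie"},
    "max_cost": 50,
    "max_minutes": 10,
}


def to_pddl_problem(spec: dict, name: str = "pie-delivery") -> str:
    """Render the spec as a PDDL problem with numeric goals (sketch only)."""
    src, goal = spec["source"], spec["goal"]
    formats = sorted({src["format"], goal["format"]})
    objects = (
        f"{' '.join(spec['sites'])} - site "
        f"{src['data']} - data "
        f"{' '.join(formats)} - format"
    )
    # Initial state: the network links, where the data starts, and zeroed counters.
    init = [f"(linked {a} {b})" for a, b in spec["links"]]
    init.append(f"(at {src['data']} {src['site']})")
    init.append(f"(has-format {src['data']} {src['format']})")
    init += ["(= (total-cost) 0)", "(= (elapsed-minutes) 0)"]
    # Goal: the data at the destination, in the right format, within budget.
    goals = [
        f"(at {goal['data']} {goal['site']})",
        f"(has-format {goal['data']} {goal['format']})",
        f"(<= (total-cost) {spec['max_cost']})",
        f"(<= (elapsed-minutes) {spec['max_minutes']})",
    ]
    lines = [
        f"(define (problem {name})",
        "  (:domain worksworld-sketch)",
        f"  (:objects {objects})",
        "  (:init " + " ".join(init) + ")",
        "  (:goal (and " + " ".join(goals) + "))",
        ")",
    ]
    return "\n".join(lines)


print(to_pddl_problem(spec))
```

Note that the spec only says where the data starts, where it must end up, and the numeric limits. Nothing in it describes the route or where the baking happens; that is left for the planner to work out.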

Why is this a Big Deal? (The "Aha!" Moment)

Most existing tools are like specialized chefs who only know how to cook one specific dish on one specific stove. If you change the stove or the dish, they break.

WORKSWORLD is different because:

  • It's Vendor-Agnostic: It doesn't care if you use Amazon, Google, or a local server farm. It just sees "Resources."
  • It's "Numeric": It reasons about quantities such as "Bandwidth," "Latency," and "Cost" as actual numbers, not just "Yes/No" logic.
  • It Scales: The authors tested it on a computer you could buy at a store (commodity hardware). They successfully planned a pipeline with 14 steps across 8 different locations in just one hour.

The Real-World Examples

The paper gives three scenarios to show why this matters:

  1. The Archivist (Slow & Cheap): You have old videos. You can compress them at the source (saving transfer costs) or send them raw to the cloud to compress (saving local storage). The AI picks the cheapest option.
  2. The Wildfire Sensor (Fast & Critical): Sensors see smoke. They need to process the image immediately to trigger an alarm. The AI decides to process the data right at the sensor (Edge) because sending it to the cloud takes too long.
  3. The Cybersecurity Guard (Instant & Strict): A hacker tries to break in. The system must block them in a split second. The AI places the security check right next to the firewall, even if it costs a bit more, because speed is the only thing that matters.
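The three scenarios differ mainly in how tight the deadline is, and the trade-off can be sketched in a few lines. This is a toy brute-force filter, not the planner's actual search, and all costs, latencies, and deadlines below are made-up numbers chosen only to illustrate the pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    name: str
    cost: float       # illustrative cost units (made up)
    latency_s: float  # illustrative end-to-end latency in seconds (made up)


# Made-up options for where a single processing step could run.
OPTIONS = [
    Placement("cloud", cost=1.0, latency_s=30.0),
    Placement("fog", cost=3.0, latency_s=2.0),
    Placement("edge", cost=8.0, latency_s=0.1),
]


def cheapest_within(deadline_s: float, options=OPTIONS):
    """Return the lowest-cost placement that meets the deadline, or None."""
    feasible = [p for p in options if p.latency_s <= deadline_s]
    return min(feasible, key=lambda p: p.cost) if feasible else None


# Hypothetical deadlines for the three scenarios, in seconds.
for scenario, deadline in [("archive", 3600.0), ("wildfire", 1.0), ("intrusion", 0.2)]:
    choice = cheapest_within(deadline)
    print(scenario, "->", choice.name if choice else "no feasible placement")
```

With these numbers the relaxed archive job lands on the cheap cloud option, while the two time-critical jobs are pushed to the more expensive edge because nothing else meets their deadlines.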

The Catch (Limitations)

The AI is smart, but it's not magic yet.

  • The "Pre-Processing" Bottleneck: Before the AI starts solving the puzzle, it has to read the entire map of the world. If the map gets too huge (too many sites and links), this reading step takes a long time. It's like trying to read a whole encyclopedia before you can answer a single question.
  • Linear Chains: Right now, the AI is best at "assembly lines" (Step A -> Step B -> Step C). It struggles a bit with complex, branching webs where data splits and merges in crazy ways.
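One plausible way to picture the pre-processing bottleneck is to count how many concrete actions a planner would have to enumerate before it starts searching. The sketch below assumes a fully connected network, one direct-transfer action per data item and ordered pair of sites, one two-hop transfer per item and ordered triple of sites, and one processing action per step and site. These counting rules are assumptions made for illustration; this summary does not say how WORKSWORLD's actions are actually structured.

```python
def ground_action_estimate(num_sites: int, num_items: int, num_steps: int) -> dict:
    """Rough count of concrete actions under the assumptions stated above."""
    direct = num_items * num_sites * (num_sites - 1)
    two_hop = num_items * num_sites * (num_sites - 1) * (num_sites - 2)
    process = num_steps * num_sites
    return {
        "direct": direct,
        "two_hop": two_hop,
        "process": process,
        "total": direct + two_hop + process,
    }


# Doubling the number of sites multiplies the two-hop count by roughly eight.
for n in (4, 8, 16):
    print(n, ground_action_estimate(n, num_items=1, num_steps=14))
```

Under these assumptions the two-hop transfers dominate, since they grow with the cube of the number of sites while processing actions grow only linearly.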

The Bottom Line

WORKSWORLD is a new planning domain that lets companies say, "Here is my data and my goal," and lets an AI planner figure out an efficient, cost-effective way to move and process that data across the globe. Within the limits above, it turns a nightmare of complex math into a short configuration file, making distributed data pipelines more accessible to people who are not planning or super-computer experts.