Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads

This paper introduces the User-space Scheduling Framework (USF) and its default cooperative policy, SCHED_COOP, which leverage user-space thread scheduling via the nOS-V runtime to eliminate OS-level preemption interference under oversubscription, achieving up to 2.4x performance gains in complex multi-runtime and multi-process HPC and AI workloads without requiring invasive application changes.

Original authors: Aleix Roca, Vicenç Beltran

Published 2026-01-29

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine a busy kitchen in a high-end restaurant. This kitchen represents a powerful computer processor with many chefs (cores) ready to cook.

The Problem: Too Many Orders, Too Many Chefs

In the world of High-Performance Computing (HPC) and Artificial Intelligence (AI), applications are becoming like massive, complex recipes that require many chefs working at once. Sometimes, the restaurant gets so busy that there are more cooks ready to work than there are stoves (cores) available. This is called oversubscription.

Traditionally, the "Head Chef" (the Operating System's scheduler) manages this chaos by forcing every cook to take turns. Every few milliseconds, the Head Chef yells, "Stop what you're doing! Switch to the next cook!" This is called preemption.

While this ensures everyone gets a turn, it causes two major problems in a high-stakes kitchen:

  1. The "Lock-Holder" Problem: Imagine a cook is holding the only heavy pot (a lock) while other cooks wait to use it. If the Head Chef yells "Stop!" and sends that cook to the bench mid-task, the pot goes to the bench with them. The cooks who now get a stove can do nothing but stand there waiting for a pot whose holder is not allowed to work. The whole line grinds to a halt until the benched cook gets another turn.
  2. The "Switching" Cost: Constantly stopping and starting cooks wastes time. It's like a chef dropping a knife, walking to the other side of the kitchen, picking up a new knife, and starting over. This wastes energy and slows down the meal service.

The Solution: A New Way to Manage the Kitchen

The authors of this paper, Aleix Roca and Vicenç Beltran, built a new system called USF (User-Space Scheduling Framework). Think of this as a new, specialized manager who works inside the kitchen team rather than being the distant Head Chef in the office.

Instead of the OS forcing random switches, USF lets the cooks manage themselves using a rule called SCHED_COOP.

How SCHED_COOP Works:

  • No Interrupts: Once a cook starts a task, they are allowed to keep working until they naturally hit a pause point (like waiting for an ingredient or finishing a step). The manager never yells "Stop!" in the middle of a task.
  • Smart Swapping: The manager only swaps cooks when one of them voluntarily stops to wait (blocks). If Cook A is waiting for onions, the manager immediately hands the stove to Cook B, who is ready to cook.
  • The "Cooperative" Aspect: The cooks agree to this system. They know that if they get stuck waiting, they will step aside so someone else can work. Because nobody is pulled off a stove while still holding the pot, the "stuck pot" problem goes away, and so does the time lost to forced switching.
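
To make the difference concrete, here is a toy, single-stove model of the two management styles. It is only an illustration, not the authors' implementation: the function name `simulate`, the tick counts, and the simplification that each cook needs the pot for their whole job are all assumptions made for this sketch. Cooperative mode is modeled as "run until the job is done", which is the simplest case of "run until you would naturally stop".

```python
from collections import deque

def simulate(preemptive, n_threads=3, work_per_thread=3, slice_ticks=1):
    """Toy single-core model: every thread needs the same spinlock for its whole job.

    Returns (total_ticks, wasted_ticks). A wasted tick is one spent spinning
    on a lock whose holder has been taken off the core.
    """
    remaining = [work_per_thread] * n_threads
    ready = deque(range(n_threads))
    lock_owner = None
    total = wasted = 0
    while ready:
        tid = ready.popleft()
        # Preemptive: at most one time slice. Cooperative: run until the job is done.
        budget = slice_ticks if preemptive else float("inf")
        ran = 0
        while ran < budget and remaining[tid] > 0:
            if lock_owner is None:
                lock_owner = tid          # lock is free, take it
            total += 1
            ran += 1
            if lock_owner == tid:
                remaining[tid] -= 1       # useful work inside the critical section
            else:
                wasted += 1               # spinning: the lock holder is not running
        if remaining[tid] == 0:
            if lock_owner == tid:
                lock_owner = None         # job done, release the lock
        else:
            ready.append(tid)             # slice expired, back of the queue
    return total, wasted

if __name__ == "__main__":
    print("preemptive :", simulate(preemptive=True))
    print("cooperative:", simulate(preemptive=False))
```

In this model the preemptive run burns ticks on cooks who hold a stove but not the pot, while the cooperative run spends every tick on useful work. Real workloads are far messier, so the numbers here say nothing about the speedups reported in the paper.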

The Magic Ingredient: The "Glibc" Extension

The clever part of this research is how they built it. Usually, changing how a computer schedules tasks requires rewriting the computer's core operating system (the "kernel"), which is like rebuilding the entire restaurant's plumbing and electrical system. That's hard, dangerous, and requires special permissions.

Instead, the authors modified the GNU C Library (glibc).

  • The Analogy: Think of glibc as the universal "uniform" and "instruction manual" that almost every computer program wears and reads. By slightly altering this uniform, the authors made the cooks (threads) automatically follow the new SCHED_COOP rules without the restaurant owner (the user) having to rewrite the recipes (the applications).
  • Seamless: Because they changed the uniform, existing applications (like OpenMP, PyTorch, or molecular dynamics simulations) just put on the new uniform and start working better immediately. No invasive surgery on the code was needed.

The Results: Faster Service

The authors tested this in several scenarios, acting like a restaurant manager running different simulations:

  1. Nested Runtimes: Imagine a recipe that calls for a sub-recipe, which calls for another sub-recipe. This creates a chaotic mix of many cooks. The new system handled this mix much better than the old "stop-and-start" method, with speedups of up to 2.4x.
  2. AI Inference (LLaMA-3): They ran multiple AI models at once. The new system allowed them to run smoothly together without the "noise" of constant switching, keeping the service fast even when the kitchen was packed.
  3. Molecular Dynamics: Simulating how molecules move is like a complex dance. The new system allowed multiple dances to happen on the same floor without the dancers tripping over each other, making fuller use of the available cores.

The Catch

The system works well as long as the cooks follow the rules. However, some older recipes use a technique called "busy-waiting," where a cook stands at the stove staring at a pot, refusing to step aside even if they are just waiting. Because the manager never interrupts anyone, the new system can't force these specific cooks to move unless the recipe is slightly tweaked to tell them, "If you're waiting, take a quick breath and let someone else work." The authors found this tweak easy to do.
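
The busy-waiting catch can be sketched with Python generators standing in for threads. Everything here is hypothetical and for illustration only: `run_cooperative`, `spinner`, `setter`, and `demo` are made-up names, and the bare `yield` is a stand-in for whatever yield call the tweaked recipe would use (in a real program, something in the spirit of `sched_yield()`; that specific call is an assumption, not a detail taken from the paper).

```python
from collections import deque

def run_cooperative(tasks):
    """One core, no preemption: a task gives up the core only when it yields or ends."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)            # run until the task's next voluntary yield
            ready.append(task)
        except StopIteration:
            pass                  # task finished

def spinner(flag, polite, spin_limit=10_000):
    """Busy-waits on flag["set"]. spin_limit exists only so the demo cannot hang."""
    spins = 0
    while not flag["set"] and spins < spin_limit:
        spins += 1
        if polite:
            yield                 # the "quick breath": let another task run
    flag["gave_up"] = not flag["set"]

def setter(flag):
    """The task the spinner is waiting for."""
    flag["set"] = True
    yield

def demo(polite):
    flag = {"set": False, "gave_up": False}
    run_cooperative([spinner(flag, polite), setter(flag)])
    return flag["gave_up"]

if __name__ == "__main__":
    print("stubborn spinner gave up:", demo(polite=False))
    print("polite spinner gave up  :", demo(polite=True))
```

The stubborn spinner hogs the only core, so the task that would set the flag never gets to run until the spinner hits its safety limit. The polite spinner steps aside after one look at the pot, the setter runs, and the wait ends almost immediately.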

Summary

In short, this paper presents a way to let computer programs manage their own busy schedules without needing the operating system to constantly interrupt them. By changing the "uniform" (glibc) rather than the "building" (kernel), they created a system that is easier to use, requires no special permissions, and makes complex, overloaded computer tasks run significantly faster and smoother.
