OSF: On Pre-training and Scaling of Sleep Foundation Models

This paper introduces OSF, a family of state-of-the-art sleep foundation models trained on a massive 166,500-hour corpus and evaluated via the open-source SleepBench benchmark. It demonstrates that channel-invariant pre-training, combined with systematic scaling of data, model capacity, and data diversity, significantly enhances generalization across diverse sleep and disease prediction tasks.

Zitao Shuai, Zongzhe Xu, David Yang, Wei Wang, Yuzhe Yang

Published 2026-03-03

Imagine you want to teach a computer to understand human sleep. To do this, you need to show it millions of nights of sleep recordings. These recordings, called Polysomnography (PSG), are like a symphony of biological signals: brain waves, heartbeats, breathing patterns, and muscle movements.

However, there's a big problem. Just like every orchestra has different instruments, every sleep study uses different equipment. Some hospitals record 12 channels of data; some home devices only record 2. Sometimes sensors fall off in the middle of the night. If you train a computer to only understand a full orchestra, it will be confused when it hears just a solo violin.

This paper introduces OSF (Open Sleep Foundation Model), a new, super-smart AI designed to understand sleep no matter what "instruments" are playing. Here is the story of how they built it, explained simply.

1. The Problem: The "Missing Instrument" Crisis

The researchers found that previous sleep AI models were like musicians who could only play if every single instrument in the orchestra was present. If you took away the brain sensors (the violins), the model couldn't tell if someone was asleep or awake. If you took away the breathing sensors (the drums), it couldn't detect sleep apnea.

In the real world, data is messy. Sensors break, and home devices are simpler. The old models failed when faced with these "missing instruments."

2. The Solution: Building a Massive Library (SleepBench)

To fix this, the team didn't just look at one dataset. They built SleepBench, a massive digital library containing 166,500 hours of sleep recordings from nine different public sources.

Think of this as gathering recordings from 21,000 different people, using 9 different types of recording equipment, covering all kinds of ages and health conditions. This gave them a diverse "training camp" to teach the AI to be flexible.

3. The Three Big Discoveries

By testing different ways to train the AI, they found three golden rules for building a sleep expert:

  • Rule #1: Don't rely on a single instrument.
    They discovered that if you train the AI to expect all sensors to be there, it fails when they aren't. The AI needs to learn that the meaning of sleep is the same, whether it's hearing a full orchestra or just a few instruments.
  • Rule #2: Teach the AI to be "Channel-Invariant."
    This is the secret sauce. They taught the AI by randomly "muting" different sensors during training. Imagine playing a song and telling the AI, "Okay, pretend the drums are gone. What does the music sound like now?" or "Pretend the violins are gone."
    By forcing the AI to learn the essence of sleep regardless of which sensors are active, it became incredibly robust. It learned that a sleeping brain looks a certain way, even if the breathing sensor is missing.
  • Rule #3: Bigger is better (if you do it right).
    They found that simply feeding the AI more data and giving it a bigger brain (more computing power) made it smarter. But this only worked because of Rule #2. If you just give a rigid AI more data, it hits a wall. If you give a flexible AI (trained with the "mute" technique) more data, it keeps getting smarter and smarter.

4. The Result: OSF

The result is OSF, a new "Foundation Model" for sleep. Think of OSF as a master sleep detective who has studied every type of recording device imaginable.

  • It's a chameleon: It works on hospital-grade machines with 12 sensors, and it still performs well on simple home headbands with only 2 sensors.
  • It's a doctor: It doesn't just tell you if you are asleep; it can predict if you have heart disease, diabetes, or high blood pressure based on your sleep patterns.
  • It's efficient: It learns faster and needs fewer examples to get good at a new task compared to older models.

The Analogy: The Universal Translator

Imagine you are trying to learn a language.

  • Old Models were like students who only learned to speak English if they had a dictionary, a grammar book, and a native speaker standing right next to them. If you took away the dictionary, they froze.
  • OSF is like a polyglot who learned the language by listening to radio, watching TV, talking to strangers, and reading street signs. If you take away the radio, they can still understand the street signs. They learned the concept of the language, not just the specific tools used to teach it.

Why This Matters

This is a big step forward for health. It brings us closer to using simple, cheap, home-based sleep trackers to help screen for serious diseases, rather than relying solely on expensive hospital tests. That would make high-quality sleep medicine accessible to everyone, not just those with access to a full hospital lab.

In short: The researchers taught a computer to understand sleep by making it practice with "broken" data, resulting in a model that is tougher, smarter, and ready for the real world.
