Deep phenotyping of blood cell data reveals novel clinical biomarkers

This study demonstrates that applying AI techniques, specifically clustering and self-supervised autoencoders, to raw single-cell blood count data can uncover novel, clinically prognostic biomarkers that capture subtle physiologic signals and predict outcomes like mortality and disease development more effectively than traditional summary metrics.

Chen, Y.-L., Zhang, C., Lucas, F., Hadlock, J., Foy, B. H.

Published 2026-03-26

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine your blood is a bustling city, and the Complete Blood Count (CBC) test is the daily newspaper that the hospital reads to check on the city's health.

For decades, this newspaper has only reported the "headlines": How many people are there? What is the average size of a house? How many police officers are on patrol? These are the standard numbers doctors look at (like total white blood cell count or average red blood cell size).

The Problem:
The authors of this paper realized that while the "newspaper" gives us the headlines, the actual raw data collected by the machines is like a high-definition, 4K video recording of every single person walking down the street. The machine sees thousands of tiny details about every single cell—how bumpy it is, how much light it reflects, how it's moving. But until now, doctors have been throwing away this rich video footage and only reading the summary headlines. They were missing subtle shifts in the crowd that could signal trouble before a crisis happens.

The Solution: Two New Ways to Watch the Video
The researchers, led by Dr. Brody Foy, decided to use Artificial Intelligence (AI) to re-watch this raw video footage and find new, hidden stories. They used two different "lenses" to do this:

  1. The "Organized Crowd" Lens (Clustering):
    Imagine sorting the crowd in the video into specific groups: the firefighters, the teachers, the construction workers. Once sorted, instead of just counting them, the AI looks at the details of each group.

    • The Analogy: Instead of just saying "There are 100 firefighters," the AI says, "The firefighters are unusually jittery," or "The smallest firefighters are getting much smaller."
    • The Result: They found that looking at the variance (how different the cells are from each other) and the extremes (the very smallest or largest cells) was a powerful predictor of future illness. For example, shrinkage of the smallest monocytes (a type of white blood cell) predicted heart trouble years before a heart attack.

  2. The "Pattern Detective" Lens (Autoencoders):
    This is the more mysterious AI. Imagine a detective who doesn't care about specific groups but looks for complex, invisible patterns connecting everyone in the crowd. Maybe the way the construction workers are standing relates to how the police are moving, even though they aren't talking to each other.

    • The Analogy: This AI finds "secret codes" in the data that humans can't easily explain. It captures non-linear relationships—like a complex dance between different cell types that signals stress in the body.
    • The Result: These "secret codes" turned out to be strong predictors of who would get sick, who would need to be admitted to the hospital, or who might develop cancer, often outperforming the standard headlines.
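
For readers who want to see the "Pattern Detective" idea in miniature, here is a toy autoencoder written in plain Python. It is only a sketch under stated assumptions, not the authors' model: the three measurement channels, the synthetic cells, the single-number "secret code", and every name in the code are made up for illustration. The summary does not say how the study's network is built or trained. The sketch only shows the core trick: squeeze each cell's measurements down to a compact code, then learn to rebuild the measurements from that code.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy run is repeatable

# Synthetic stand-in for per-cell measurements (3 made-up channels per cell).
# This is NOT real analyzer data; one hidden factor drives all three channels.
def make_cell():
    t = rng.uniform(-1.0, 1.0)
    return [t + rng.gauss(0, 0.05),
            0.5 * t + rng.gauss(0, 0.05),
            -t + rng.gauss(0, 0.05)]

cells = [make_cell() for _ in range(200)]
n_in = 3

# Encoder: 3 inputs -> 1 latent value (tanh). Decoder: 1 latent -> 3 outputs.
w_enc = [rng.uniform(-0.5, 0.5) for _ in range(n_in)]
b_enc = 0.0
w_dec = [rng.uniform(-0.5, 0.5) for _ in range(n_in)]
b_dec = [0.0] * n_in

def encode(x):
    """Compress one cell's measurements into a single 'secret code'."""
    return math.tanh(sum(w * xi for w, xi in zip(w_enc, x)) + b_enc)

def decode(z):
    """Rebuild the three measurements from the code."""
    return [w * z + b for w, b in zip(w_dec, b_dec)]

def mean_reconstruction_error(data):
    total = 0.0
    for x in data:
        x_hat = decode(encode(x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

error_before = mean_reconstruction_error(cells)

# Plain stochastic gradient descent on the squared reconstruction error.
learning_rate = 0.05
for epoch in range(200):
    for x in cells:
        z = encode(x)
        x_hat = decode(z)
        grad_out = [2.0 * (a - b) for a, b in zip(x_hat, x)]
        grad_z = sum(g * w for g, w in zip(grad_out, w_dec))
        grad_pre = grad_z * (1.0 - z * z)  # derivative of tanh
        for j in range(n_in):
            w_dec[j] -= learning_rate * grad_out[j] * z
            b_dec[j] -= learning_rate * grad_out[j]
            w_enc[j] -= learning_rate * grad_pre * x[j]
        b_enc -= learning_rate * grad_pre

error_after = mean_reconstruction_error(cells)
latent_codes = [encode(x) for x in cells]  # one learned code per cell
print(f"reconstruction error: {error_before:.3f} -> {error_after:.3f}")
```

No labels are needed to train it, which is what "self-supervised" means here: the data is its own answer key. In a study like this one, the learned codes (or summaries of them) would then be tested as candidate biomarkers against outcomes.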

What Did They Discover?
By applying these AI lenses to over 240,000 blood tests, they found:

  • New Early Warning Systems: They discovered hundreds of new "biomarkers" (warning signs). For example, they found that the spread of sizes in neutrophils (a type of white blood cell) is a strong predictor of death or hospital admission.
  • Hidden Connections: The "Pattern Detective" AI found that these blood cell patterns were linked to other things in the body, such as specific infections (HIV or CMV, for example), hormone levels, and even how well your blood clots. It's like realizing that the way the "firefighters" are walking in your blood is actually a sign that your "plumbing" (coagulation) is stressed.
  • Better Than the Headlines: Even after adjusting for age, sex, and the standard blood test numbers, these new AI-derived markers still provided strong predictions. They added a whole new layer of information that was previously invisible.
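
The "spread of sizes" idea from the first finding can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the study's pipeline: the cell sizes are synthetic, a simple two-group k-means on one measurement stands in for whatever clustering the authors actually used, and all function names are hypothetical. The point is only that once cells are grouped, each group yields more than a count: its variance and its extremes (here the 5th and 95th percentiles) become candidate markers.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the toy run is repeatable

# Synthetic cell "sizes" (arbitrary units) for one blood sample:
# two made-up populations, not real analyzer output.
sizes = ([rng.gauss(8.0, 0.5) for _ in range(300)]
         + [rng.gauss(14.0, 1.5) for _ in range(200)])

def kmeans_1d(values, n_iter=20):
    """Two-cluster k-means on a single measurement; returns a 0/1 label per value."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each cell to its nearest center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = statistics.fmean(members)
    return labels

def cluster_features(values, labels):
    """Per-cluster count, mean, variance, and 5th/95th percentiles."""
    features = {}
    for k in sorted(set(labels)):
        members = [v for v, lab in zip(values, labels) if lab == k]
        cuts = statistics.quantiles(members, n=20)  # 19 cut points: 5%, 10%, ..., 95%
        features[k] = {
            "count": len(members),
            "mean": statistics.fmean(members),
            "variance": statistics.pvariance(members),
            "p05": cuts[0],
            "p95": cuts[-1],
        }
    return features

labels = kmeans_1d(sizes)
features = cluster_features(sizes, labels)
for k, stats in features.items():
    print(k, {name: round(value, 2) for name, value in stats.items()})
```

A standard report would keep only the count and perhaps the mean of each group. The variance and percentile entries are the kind of extra detail the study mines from the raw data.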

Why Does This Matter?
Think of it like upgrading from a radio broadcast to a surround-sound 3D movie.

  • Current Medicine: We listen to the radio (standard blood counts). It tells us if there's a storm, but only after the wind starts blowing.
  • This Study: We are now watching the 3D movie. We can see the clouds gathering and the pressure dropping before the storm hits.

The Bottom Line:
This preprint suggests that we have been sitting on a goldmine of data for years. By using modern AI to dig deeper into the raw, single-cell data from routine blood tests, we may be able to build new tools to predict disease, catch problems earlier, and potentially save lives, all without needing to invent new, expensive tests. We may just need better software to read the data we already have.
