Single-cell hit calling in high-content imaging screens with Buscar

The paper introduces Buscar, an open-source Python method for high-content screening that improves hit calling by leveraging full single-cell heterogeneity to define morphology signatures and quantify both compound efficacy and specificity, thereby overcoming the limitations of traditional aggregate statistics.

Original authors: Serrano, E., Li, W.-s., Way, G. P.

Published 2026-04-19

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are a detective trying to solve a mystery in a massive city. The city represents a petri dish full of cells, and the "mystery" is figuring out which chemical drug or genetic tweak can fix a sick cell and turn it back into a healthy one.

For a long time, scientists had a blind spot. When they looked at the city, they didn't look at individual people; they just took a census average. They'd say, "The average citizen is 5 feet 6 inches tall." But what if half the city is made of giants and the other half of dwarfs? The average hides the truth. In biology, this averaging is called "aggregation." It treats every cell as if it reacted the same way to a drug, which is rarely true. Some cells might get cured, some might get worse, and some might not care at all.
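
For readers who like to see the numbers, here is a minimal Python sketch of how an average can hide what individual cells are doing. The measurements are made up for illustration and do not come from the paper.

```python
import statistics

# Hypothetical single-cell measurements of one feature (arbitrary units).
untreated = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8]
# After a hypothetical treatment, half the cells shift up and half shift down.
treated = [7.0, 7.1, 6.9, 3.0, 3.1, 2.9]

# The aggregate view: one number per group.
mean_untreated = statistics.mean(untreated)
mean_treated = statistics.mean(treated)

# The single-cell view: how spread out the cells are within each group.
spread_untreated = statistics.pstdev(untreated)
spread_treated = statistics.pstdev(treated)

print(f"means:   {mean_untreated:.2f} vs {mean_treated:.2f}")
print(f"spreads: {spread_untreated:.2f} vs {spread_treated:.2f}")
```

The two groups have essentially the same mean, so an averaged readout would report no change, even though every treated cell in this toy example moved.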

Enter Buscar (which means "to look for" in Spanish). It's a new tool that changes the game by letting scientists look at every single cell individually, rather than just the average.

Here is how Buscar works, using a simple analogy:

The Setup: The "Before" and "After" Photos

Imagine you have two groups of people:

  1. The Sick Group (Reference): People with a specific illness (e.g., a heart condition).
  2. The Healthy Group (Target): People who are perfectly healthy.

Scientists take photos of both groups and measure everything: height, shoe size, how fast they walk, how loud they talk, etc. These are the "morphology features."

Step 1: The "On" and "Off" Lists

Buscar looks at the photos and creates two lists:

  • The "On" List (The Symptoms): These are the traits that are different between the sick and healthy groups. Maybe sick people walk slowly and have red shoes, while healthy people walk fast and wear blue shoes. If a drug changes a sick person to walk fast and wear blue, that's good! This is Efficacy.
  • The "Off" List (The Normal Stuff): These are traits that are the same for both groups. Maybe both groups have the same hair color and eye shape. If a drug changes a sick person's hair color to neon green, that's weird! That wasn't part of the cure; it's a side effect. This is Specificity (or lack of off-target effects).
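
A rough Python sketch of this sorting step is shown here. It assumes a two-sample Kolmogorov-Smirnov statistic and a cutoff of 0.3 to decide whether a feature differs between the two groups. Both are illustrative choices, not necessarily what Buscar uses, and the feature names and the `split_signature` helper are hypothetical.

```python
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap

def split_signature(reference, target, cutoff=0.3):
    """Sort features into 'on' (differ between groups) and 'off' (look the same).

    The cutoff is an assumption made for this sketch.
    """
    on_features, off_features = [], []
    for feature in reference:
        if ks_statistic(reference[feature], target[feature]) > cutoff:
            on_features.append(feature)
        else:
            off_features.append(feature)
    return on_features, off_features

# Simulated single-cell data; feature names are hypothetical.
random.seed(0)
n_cells = 200
reference = {  # the "sick" group
    "nucleus_area": [random.gauss(1.0, 0.2) for _ in range(n_cells)],
    "cell_eccentricity": [random.gauss(0.5, 0.1) for _ in range(n_cells)],
}
target = {  # the "healthy" group
    "nucleus_area": [random.gauss(2.0, 0.2) for _ in range(n_cells)],
    "cell_eccentricity": [random.gauss(0.5, 0.1) for _ in range(n_cells)],
}

on_features, off_features = split_signature(reference, target)
print("on: ", on_features)
print("off:", off_features)
```

In this simulation only `nucleus_area` was generated differently for the two groups, so it is the feature that lands on the "On" list.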

Step 2: The Drug Test

Now, you give a new drug to the sick people. Buscar takes a photo of them after the drug and compares them to the lists.

  • The "On-Buscar" Score (The Cure Score): How close do the treated people look to the "Healthy Group" regarding the "On" traits?
    • Low Score: Great! They look just like the healthy people. The drug worked.
    • High Score: Bad. They still look sick.
  • The "Off-Buscar" Score (The Side Effect Score): Did the drug mess up the "Off" traits?
    • Low Score: Great. They kept their normal hair and eye color. The drug was precise.
    • High Score: Bad. They turned neon green. The drug is too messy and might be dangerous.
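
The two scores can be pictured with a small Python sketch. It assumes a one-dimensional earth mover's distance averaged over features, with treated cells compared to the healthy group on the "On" traits and to the untreated sick group on the "Off" traits. The distance measure, the `buscar_like_scores` helper, the feature names, and all the numbers are assumptions for illustration and are not the Buscar package's API.

```python
import statistics

def earth_movers_distance(a, b):
    """1-D earth mover's distance for two samples of equal size."""
    if len(a) != len(b):
        raise ValueError("this simplified version needs equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def buscar_like_scores(treated, reference, target, on_features, off_features):
    """Return (on_score, off_score); lower is better for both."""
    # On score: how far treated cells still are from the healthy target.
    on_score = statistics.mean(
        [earth_movers_distance(treated[f], target[f]) for f in on_features]
    )
    # Off score: how far treated cells drifted on traits that should not change.
    off_score = statistics.mean(
        [earth_movers_distance(treated[f], reference[f]) for f in off_features]
    )
    return on_score, off_score

# Made-up single-cell values for two hypothetical features.
reference = {"nucleus_area": [1.0, 1.1, 0.9, 1.0], "cell_eccentricity": [0.5, 0.6, 0.4, 0.5]}
target = {"nucleus_area": [2.0, 2.1, 1.9, 2.0], "cell_eccentricity": [0.5, 0.6, 0.4, 0.5]}
on_features, off_features = ["nucleus_area"], ["cell_eccentricity"]

treatments = {
    "precise_drug": {"nucleus_area": [2.0, 2.0, 1.9, 2.1], "cell_eccentricity": [0.5, 0.5, 0.6, 0.4]},
    "messy_drug": {"nucleus_area": [2.0, 2.0, 1.9, 2.1], "cell_eccentricity": [1.5, 1.5, 1.6, 1.4]},
    "no_effect": {"nucleus_area": [1.0, 1.1, 0.9, 1.0], "cell_eccentricity": [0.5, 0.6, 0.4, 0.5]},
}

scores = {
    name: buscar_like_scores(cells, reference, target, on_features, off_features)
    for name, cells in treatments.items()
}
for name, (on_score, off_score) in scores.items():
    print(f"{name}: on={on_score:.2f} off={off_score:.2f}")
```

In this toy data the precise drug scores low on both, the messy drug scores low on the "On" traits but high on the "Off" traits, and the drug with no effect does the reverse.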

Why This Matters: The Three Big Wins

The paper tested this tool on three different "cities" (datasets) to prove it works:

  1. The Heart City (Cardiac Fibroblasts):
    Scientists had heart cells from patients with heart failure. They tried a drug meant to fix the heart.

    • Old Way: They would have just said, "The average cell looks a little better."
    • Buscar Way: It saw that the drug moved the "sick" cells toward the healthy look (low On-Buscar score) but also changed traits that were never part of the illness (high Off-Buscar score). This tells scientists: "This drug works, but be careful, it has side effects."
  2. The Gene City (MitoCheck):
    They tested thousands of genes to see which ones, when turned off, caused specific cell shapes (like a cell splitting into two heads).

    • The Result: Buscar was like a smart librarian. It correctly ranked the genes. If turning off a gene made cells look like the "two-headed" target, that gene got a top rank. If it didn't, it got a low rank. This suggests the rankings reflect real biology, not just a numerical quirk.
  3. The Reproducibility City (CPJUMP1):
    Sometimes, if you run an experiment on Monday vs. Tuesday, or in Plate A vs. Plate B, the results change because of tiny technical differences (like the temperature of the room).

    • The Result: Buscar held steady. Even when the same drug was tested on different plates, it gave consistent scores. This means scientists can be more confident that a hit is a real effect, not a lucky accident.
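
The ranking and plate-to-plate checks in the list above can be pictured with a small Python sketch. All scores here are invented, the `rank_hits` and `score_spread` helpers are hypothetical, and sorting by the On score first and the Off score second is just one plausible way to order hits, not a rule taken from the paper.

```python
# Hypothetical (on_score, off_score) pairs per perturbation; lower is better for both.
scores = {
    "gene_A": (0.10, 0.05),
    "gene_B": (0.80, 0.02),
    "gene_C": (0.10, 0.40),
    "gene_D": (0.45, 0.10),
}

def rank_hits(scores):
    """Order perturbations by On score, breaking ties with the Off score."""
    return sorted(scores, key=lambda name: scores[name])

ranked = rank_hits(scores)
print("ranking:", ranked)

# Hypothetical On scores for the same compound measured on three plates.
plate_scores = {
    "compound_X": [0.21, 0.19, 0.20],
    "compound_Y": [0.10, 0.55, 0.30],
}

def score_spread(values):
    """Gap between the highest and lowest score across plates."""
    return max(values) - min(values)

for compound, values in plate_scores.items():
    print(f"{compound}: spread across plates = {score_spread(values):.2f}")
```

A small spread, as for the made-up `compound_X`, is the kind of agreement across plates that makes a hit easier to trust.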

The Bottom Line

Buscar is like upgrading from a blurry, black-and-white group photo to a high-definition, 4K video where you can see every person's face.

  • Old Method: "The crowd looks happy." (But maybe only half are, and the other half are crying).
  • Buscar: "Group A is cheering, Group B is crying, and Group C is confused. Let's find the drug that makes Group B cheer without making Group C cry."

By separating the "cure" from the "side effects" and looking at every single cell, Buscar helps scientists find better drugs faster and avoid the ones that might fail later in human trials. It's a smarter, more honest way to look for the next big medical breakthrough.
