Benchmarking circRNA Detection Tools from Long-Read Sequencing Using Data-Driven and Flexible Simulation Framework

This study introduces a novel, open-source simulation framework to generate realistic Oxford Nanopore long-read datasets and presents the first comprehensive benchmark comparing three circRNA detection tools (CIRI-long, IsoCirc, and circNICK-LRS), revealing their distinct performance profiles and limited overlap to guide future tool selection and development.

Original authors: Rusakovich, A., CORRE, S., Cadieu, E., Fraboulet, R.-M., Le Bars, V., Galibert, M.-D., Derrien, T., Blum, Y.

Published 2026-03-06

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine your cell's genetic library (DNA) is like a massive cookbook. Usually, when the cell wants to make a dish (a protein), it copies a recipe from the book, cuts out the boring parts (introns), and stitches the good parts (exons) together to make a straight, linear instruction manual. This is how most of our genes work.

But sometimes, the cell gets creative. Instead of making a straight line, it takes the ends of a recipe and glues them together to form a circle. These are called circRNAs. Think of them as a recipe looped back on itself, like a bracelet or a donut. Because they are closed loops, they are super tough and don't fall apart easily, making them very stable. Scientists think these "donuts" might hold clues to diseases like cancer, so finding them is a big deal.

The Problem: Finding the Donuts in the Haystack

To find these circular donuts, scientists use a high-tech scanner called Oxford Nanopore sequencing. Unlike older scanners that chop the genetic material into tiny crumbs that have to be pieced back together, this newer scanner can read the whole "donut" in one go. That's great!

But here's the catch: The software (the "detectives") trying to find these donuts in the scanner data isn't very good yet. There are three main detective tools available (CIRI-long, IsoCirc, and circNICK-LRS), but no one knew which one was the best, or if they were even looking in the right places.

The Experiment: The "Fake Donut" Factory

To test these detectives, the authors of this paper built a virtual simulation factory.

Imagine you want to test a metal detector, but you don't want to dig up a real park and risk missing a real treasure. Instead, you build a sandbox where you bury exactly 7,500 fake gold coins (circRNAs) and 117,000 fake rocks (linear RNA) in a specific pattern. You know exactly where every coin is. This is your "Ground Truth."

The authors created a computer program that:

  1. Looked at real biological data to understand what these "donuts" actually look like (how long they are, what they are made of).
  2. Used a tool called NanoSim to generate simulated reads that closely resemble real data from the Nanopore scanner, including its typical errors.
  3. Ran the three detective tools on this fake data to see who found the most coins and who made the most mistakes.
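
To make the "sandbox with known coins" idea concrete, here is a minimal Python sketch of the data-driven part: it resamples circRNA lengths from a list of lengths observed in real data and labels every simulated molecule, so the ground truth is known by construction. This is not the authors' framework and it does not call NanoSim or generate reads. The function name, the example lengths, and the length range used for linear RNA are all assumptions made up for illustration; only the counts (7,500 and 117,000) come from the text above.

```python
import random

def simulate_ground_truth(observed_circ_lengths, n_circ, n_linear, seed=0):
    """Build a labeled list of simulated molecules (hypothetical helper).

    observed_circ_lengths: circRNA lengths (nt) taken from real data.
    Returns a list of dicts with an id, a true label, and a length.
    """
    rng = random.Random(seed)
    molecules = []
    for i in range(n_circ):
        # Resample from the empirical lengths so the fakes follow the real ones.
        length = rng.choice(observed_circ_lengths)
        molecules.append({"id": f"circ_{i}", "label": "circRNA", "length": length})
    for i in range(n_linear):
        # Placeholder length model for linear RNA (an assumption, not from the paper).
        length = rng.randint(200, 5000)
        molecules.append({"id": f"lin_{i}", "label": "linear", "length": length})
    rng.shuffle(molecules)
    return molecules

# Made-up observed lengths; the counts match the sandbox described above.
observed = [250, 400, 400, 900, 1500]
truth = simulate_ground_truth(observed, n_circ=7500, n_linear=117000)
n_circ = sum(m["label"] == "circRNA" for m in truth)
print(n_circ, len(truth))
```

Because every molecule carries its true label, any tool's output can later be checked against it. A real simulator would also have to model sequence content and sequencing errors, which this sketch leaves out.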

The Results: The Three Detectives

Here is how the three tools performed, using our "Donut Detective" analogy:

1. IsoCirc: The "Sniper"

  • Style: Extremely precise but very picky.
  • Performance: When it said, "I found a donut!" it was almost always right (High Precision). However, it missed a huge number of donuts that were actually there (Low Recall).
  • Weakness: It has a built-in size limit. It refuses to look for "giant donuts" (long circRNAs) and mostly ignores the weird, rare types.
  • Best for: Researchers who need high confidence in a few specific findings and have limited computer power.

2. CIRI-long: The "Heavy Lifter"

  • Style: A balanced approach, but it's a bit clumsy and hungry.
  • Performance: It found more donuts than the Sniper, but it still missed some. It was the only one brave enough to find a specific rare type of donut called a "ciRNA."
  • Weakness: It is extremely hungry. It eats up so much computer memory (RAM) that it can crash your computer if you aren't careful.
  • Best for: Researchers who need a middle-ground solution and have powerful computers.

3. circNICK-LRS: The "Net Catcher"

  • Style: Casts a wide net and catches almost everything.
  • Performance: It found the most donuts, including the giant ones and the hard-to-find ones (High Recall). It was the most sensitive tool.
  • Weakness: Because it casts such a wide net, it sometimes mistakes rocks for donuts (Lower Precision). It also struggles to describe the exact shape of the donut (the internal structure), and it is very slow, like a turtle.
  • Best for: Researchers who want to find every possible donut, even if they have to double-check the results later.
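
"Precision" and "Recall" have simple definitions once you have a ground truth. Precision is the share of a tool's calls that are real donuts; recall is the share of real donuts the tool managed to find. The sketch below computes both for two invented call sets, one "sniper-like" and one "net-like". The IDs and numbers are toy data, not results from the paper, and the function name is hypothetical. In practice a circRNA would more likely be identified by its back-splice junction coordinates than by a simple label.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for a set of calls against a ground-truth set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

# Toy ground truth: ten real circRNAs (invented IDs).
truth = {f"c{i}" for i in range(1, 11)}

# A picky tool: only two calls, both correct.
sniper_calls = {"c1", "c2"}
# A wide net: eight correct calls plus two false alarms ("rocks").
net_calls = {f"c{i}" for i in range(1, 9)} | {"x1", "x2"}

sniper_pr = precision_recall(sniper_calls, truth)
net_pr = precision_recall(net_calls, truth)
print(sniper_pr)  # (1.0, 0.2)
print(net_pr)     # (0.8, 0.8)
```

The picky tool is never wrong but finds only a fifth of the donuts, while the wide net finds most of them at the cost of some false alarms. That is the same trade-off described for the three detectives above.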

The Big Takeaway

The study revealed a shocking truth: The three detectives rarely agreed with each other.

  • If you used only IsoCirc, you might miss 90% of the donuts.
  • If you used only circNICK-LRS, you might have a lot of false alarms.
  • The Solution: The authors suggest that to get the full picture, you shouldn't rely on just one tool. You should run all three and combine their results. It's like using a metal detector, a shovel, and a magnet together to find all the treasure.
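
"Rarely agreed" and "combine their results" can both be expressed with plain set operations. The sketch below measures pairwise agreement with the Jaccard index (shared calls divided by all calls from either tool), then builds the combined list and counts how many tools support each call. The call sets are invented toy data, not the paper's results, and the Jaccard index is one reasonable choice of overlap measure rather than necessarily the one the authors used.

```python
from collections import Counter
from itertools import combinations

def jaccard(a, b):
    """Shared calls divided by all calls made by either tool."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy call sets (invented IDs; "x1" stands for a false alarm).
calls = {
    "IsoCirc": {"c1", "c2"},
    "CIRI-long": {"c2", "c3", "c4"},
    "circNICK-LRS": {"c3", "c4", "c5", "c6", "x1"},
}

# Pairwise agreement between tools.
overlap = {
    (n1, n2): round(jaccard(s1, s2), 2)
    for (n1, s1), (n2, s2) in combinations(calls.items(), 2)
}
print(overlap)

# Combined results: everything any tool found, and what all tools agree on.
union_calls = set().union(*calls.values())
shared_by_all = set.intersection(*calls.values())

# How many tools support each call; low-support calls deserve a second look.
support = Counter(call for tool_calls in calls.values() for call in tool_calls)
print(len(union_calls), len(shared_by_all), support["c3"], support["x1"])
```

In this toy example no call is reported by all three tools, yet the combined list is larger than any single tool's list. Taking the union raises recall but also carries over each tool's false alarms, so calls supported by only one tool are the natural ones to double-check.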

Why This Matters

Before this study, scientists were guessing which software to use. Now, they have a clear map.

  • If you want speed and accuracy for a few specific targets, use IsoCirc.
  • If you want to find everything and don't mind a slow process, use circNICK-LRS.
  • If you want a balance, use CIRI-long.

The authors also made their "Fake Donut Factory" (the simulation code) free for everyone to use. This means other scientists can now test their own new tools against a known ground truth, helping to build better software for understanding these mysterious genetic "donuts" that could one day help cure diseases.
