deluxpore: a Nextflow pipeline for demultiplexing Illumina dual-indexed Nanopore libraries

The paper introduces **deluxpore**, a Nextflow pipeline that enables accurate demultiplexing of Nanopore reads from Illumina dual-indexed libraries by addressing challenges like residual adapters and high error rates, thereby facilitating reliable hybrid target-capture metagenomics workflows.

Original authors: Arnaiz del Pozo, C., Sanchis-Lopez, C., Huerta-Cepas, J.

Published 2026-03-30

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are running a massive, high-speed mail sorting facility.

The Problem: The "Noisy" Mail

In the world of DNA sequencing, scientists often want to find very rare, specific "letters" (genes) hidden inside a huge pile of "junk mail" (the rest of the genome). To do this, they use a technique called Target Capture, which is like using a magnet to pull out only the specific letters they care about.

Usually, this magnet works best with short, crisp letters (Illumina sequencing). But scientists also want to use Nanopore sequencing, which is like a super-fast, long-distance courier. It can read entire, long letters in one go, which is great for understanding the full story. However, Nanopore couriers are a bit "noisy" and make mistakes (typos) much more often than the standard short-read couriers.

The Conflict:

  1. The Magnet: Needs letters to have specific, clean "address labels" (indexes) to know which pile they belong to.
  2. The Courier: Delivers the letters, but the address labels are often smudged, torn, or buried under extra wrapping paper (adapter fragments) because the courier is so fast and error-prone.

Standard sorting software is like a robot that expects perfect, crisp labels. If the label is smudged or the wrapping paper is in the way, the robot throws the letter in the "trash" (unassigned data). This meant scientists couldn't use the powerful long-read couriers for their targeted magnet experiments.

The Solution: deluxpore (The Smart Sorter)

The authors of this paper built a new tool called deluxpore. Think of it as a super-smart, patient human sorter who can handle messy mail.

Instead of just looking for a perfect match, deluxpore uses two clever tricks:

  1. The "Fuzzy Match" (BLAST & Levenshtein Distance): If a label says "New York" but the courier wrote "Nw Yrok," a normal robot says, "Wrong!" and throws it away. deluxpore says, "Ah, that's close enough to New York. Let's count the typos and figure it out." It can handle the smudges and missing letters.
  2. The "Wrapper" Awareness: It knows that sometimes the address label is stuck under a piece of tape (the adapter). It knows how to look under the tape to find the real address.
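The "fuzzy match" idea can be sketched in a few lines of Python. This is an illustration only, not deluxpore's actual code (the pipeline is described as using BLAST and Levenshtein distance, and its internals are not shown here). The function names, the `max_distance` threshold of 2, and the index sequences are all assumptions made up for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: fewest substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def assign_index(observed: str, known: dict[str, str], max_distance: int = 2):
    """Return the name of the closest known index, or None if too distant or tied.

    max_distance is a hypothetical tolerance, not a value taken from the paper.
    """
    scored = sorted((levenshtein(observed, seq), name) for name, seq in known.items())
    best_dist, best_name = scored[0]
    if best_dist > max_distance:
        return None  # too many typos to trust
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # two indexes are equally close: ambiguous
    return best_name


def best_window(read_end: str, index: str) -> tuple[int, int]:
    """Slide the index along the read end and return (distance, offset) of the best window.

    A crude stand-in for "looking under the tape": the index may sit behind
    leftover adapter bases instead of at a fixed position.
    """
    k = len(index)
    return min((levenshtein(read_end[i:i + k], index), i)
               for i in range(max(1, len(read_end) - k + 1)))


# Made-up index sequences, for illustration only.
known_indexes = {"sample_A": "ACGTACGT", "sample_B": "TTGCAAGC"}

print(assign_index("ACGTCGT", known_indexes))   # one base deleted from sample_A's index
print(assign_index("GGGGGGGG", known_indexes))  # nothing close enough
print(best_window("GGTTAACC" + "ACGTACGT" + "TTTT", "ACGTACGT"))
```

The tie check matters as much as the distance threshold: a label that is equally close to two known indexes is safer left unassigned than guessed.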

The Experiment: Testing the Sorter

The team tested this new sorter with two scenarios:

Scenario A: The "Combinatorial" Pile (96 Samples)
Imagine they tried to sort 96 different people's mail by giving them labels that were combinations of two colors (e.g., Red-Blue, Red-Green).

  • The Result: It was chaotic. Because colors were shared between labels, one readable color was not enough: if the second color was smudged, the sorter couldn't tell whether "Red-?" meant "Red-Blue" or "Red-Green." Even with high-quality mail, they recovered only about 46% of the letters.

Scenario B: The "Unique" Pile (8 Samples)
They realized that if they gave each person a completely unique, one-of-a-kind label (no shared colors), the sorter could work much better. Even if one part of the label was smudged, the other unique part was enough to identify the owner.

  • The Result: With high-quality mail, they recovered 91.7% of the letters. With the best quality, they hit nearly 98%.
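The difference between the two labeling schemes can be made concrete with a small sketch. It assumes a hypothetical `resolve_sample` helper that receives the two index calls for a read, either of which may be `None` when it could not be read, and a design table mapping index pairs to samples. The colors stand in for index sequences; none of this is taken from deluxpore's code.

```python
def resolve_sample(first_call, second_call, design):
    """Return the single sample consistent with the readable index calls, else None.

    design maps (first_index, second_index) -> sample name.
    A call of None means that index could not be identified on the read.
    """
    if first_call is None and second_call is None:
        return None  # nothing readable, nothing to go on
    candidates = [
        sample
        for (first, second), sample in design.items()
        if (first_call is None or first == first_call)
        and (second_call is None or second == second_call)
    ]
    return candidates[0] if len(candidates) == 1 else None


# Combinatorial design: every color is reused across several samples.
combinatorial = {
    ("Red", "Blue"): "s1",
    ("Red", "Green"): "s2",
    ("Yellow", "Blue"): "s3",
    ("Yellow", "Green"): "s4",
}

# Unique design: no color is shared between samples.
unique = {
    ("Red", "Blue"): "s1",
    ("Yellow", "Green"): "s2",
}

print(resolve_sample("Red", None, combinatorial))  # None: could be s1 or s2
print(resolve_sample("Red", None, unique))         # s1: "Red" belongs to one sample only
```

With shared colors, a single readable index leaves several candidate samples, so the read has to be discarded. With unique pairs, either index alone pins down the sample.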

The Big Takeaway

The paper teaches us two main lessons for the future of DNA research:

  1. Quality Matters: You need the mail to be reasonably clean (a quality score of Q20 or higher). If the letters are too messy, even the smartest sorter can't help.
  2. Design Matters More: It's better to use a simple, unique labeling system (one unique label per person) rather than a complex, shared system. By avoiding "confusing" label pairs, scientists can get almost perfect results.
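For readers unfamiliar with quality scores: a Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means roughly 1 wrong base in 100. The sketch below shows that conversion and a simple read filter. The Q20 cutoff comes from the text above; the function names and the choice to filter on mean read quality are assumptions for illustration, not a description of how deluxpore filters reads.

```python
import math


def phred_to_error(q: float) -> float:
    """Convert a Phred quality score to the probability that a base call is wrong."""
    return 10 ** (-q / 10)


def mean_read_quality(quals: list[int]) -> float:
    """Mean quality of a read, computed by averaging error probabilities.

    Averaging the Q values directly would overstate quality, because the
    scale is logarithmic and a few bad bases dominate the error rate.
    """
    mean_error = sum(phred_to_error(q) for q in quals) / len(quals)
    return -10 * math.log10(mean_error)


def passes_filter(quals: list[int], min_q: float = 20.0) -> bool:
    """Hypothetical filter: keep reads whose mean quality is at least min_q."""
    return mean_read_quality(quals) >= min_q


print(phred_to_error(20))           # about 0.01, i.e. 1 error per 100 bases
print(passes_filter([30, 30, 30]))  # uniformly good read
print(passes_filter([30, 30, 5]))   # one very poor base drags the whole read down
```

At Q20, a short index of around ten bases is usually read cleanly or with a single typo, which is the regime where fuzzy matching works well. At much lower quality, several errors per index become common and even tolerant matching starts to confuse labels.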

In short: deluxpore is the bridge that finally lets scientists use the fast, long-read Nanopore couriers for their targeted DNA experiments, provided they use a smart labeling strategy and decent quality mail. It turns a "broken" workflow into a reliable, automated system.
