Community needs for FAIR pathogen data

This study by the Pathogen Data Network identifies systemic barriers and specific training priorities among infectious disease stakeholders. It finds that limited funding, data aggregation challenges, and a need for bioinformatics education are the primary impediments to achieving FAIR pathogen data, and it offers an evidence-based roadmap for community-responsive support and infrastructure development.

van Geest, G., Thomas-Lopez, D., Feitzinger, A. A., Weissgold, L. A., Halabi, S., Cuesta, I., Hjerde, E., Gurwitz, K. T., Arora, N., Neves, A., Palagi, P. M., Williams, J. J.

Published 2026-04-15

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine a massive, global library dedicated to the world's most dangerous germs—viruses, bacteria, and parasites. This library holds the blueprints to every disease we've ever faced. If we could read these blueprints perfectly and share them instantly, we could stop outbreaks before they start and cure diseases faster.

However, right now, this library is a bit of a mess. The books are scattered in different rooms, some are written in code no one understands, and many are locked behind paywalls or missing pages. This is the problem the Pathogen Data Network (PDN) is trying to solve. They want to make this data FAIR: Findable, Accessible, Interoperable (works together), and Reusable.

To figure out how to fix the mess, the PDN team asked 136 experts (scientists, doctors, teachers, and data wizards) a simple question: "What's stopping you from using this data effectively?"

Here is what they found, explained through some everyday analogies:

1. The Big Three Roadblocks

The experts didn't say the problem was that the library was too small or the books were too hard to read. Instead, they pointed to three huge, structural issues:

  • The Empty Wallet (74%): The biggest problem is money. Imagine trying to build a highway to connect all the library rooms, but the budget runs out before the road is finished. Scientists need funding to buy computers, pay for software subscriptions, and hire people to organize the data. Without the budget, the project stalls.
  • The Tower of Babel (68%): The second issue is data silos. Imagine every country and hospital has its own version of the library, but they all use different filing systems. One group files by color, another by size, and another by the smell of the book. Trying to combine them is a nightmare. The data exists, but it's stuck in isolated boxes that don't talk to each other.
  • The Missing Mechanics (52%): The third issue is a shortage of skilled people. You can have the best library in the world, but if you don't have librarians who know how to catalog the books or mechanics who can fix the shelves, the system fails. There just aren't enough people trained to handle this complex data.

2. What They Need to Learn

The survey asked, "If you could take a class to fix these problems, what would you study?"

  • The Top Request: Bioinformatics for Infectious Disease. Think of this as learning the "universal translator" for germs. Scientists want to learn how to use computers to decode the genetic language of viruses so they can spot patterns and predict outbreaks.
  • The "How-To" Guide: Many people wanted training on the Pathogens Portal. This is the PDN's new "central hub"—a single website where all the data and tools live. People want a user manual on how to navigate it without getting lost.
  • The Difference Between Teachers and Researchers:
    • Researchers wanted "advanced tools." They asked for classes on Machine Learning and AI—basically, teaching computers to do the heavy lifting of finding patterns in the data.
    • Teachers/Educators wanted "real-world stories." They asked for case studies. They wanted to see how data saved a life in a specific outbreak so they could teach that story to their students.

3. How They Want to Learn

When asked how they wanted to learn these skills, the answer was clear: Keep it flexible.

  • Virtual Short Courses (68%) and Webinars (66%) were the most popular.
  • Why? Scientists are busy. They don't have time to fly to a conference for a week. They want to learn in 30-minute chunks while they are at their desks, just like watching a quick tutorial video on how to fix a leaky faucet.

4. The Most Important Tool

When asked, "What is the single most helpful thing the PDN can give you?" the answer was the Pathogens Portal.

  • The Analogy: Imagine if all the scattered library rooms were finally connected by a giant, magical elevator that takes you directly to the right book. That is what this portal is. 72% of respondents named this central hub as the most essential resource the PDN offers.

The Bottom Line

The paper concludes that the problem isn't that the technology is too hard or the data is too scary. The problem is systemic.

It's like trying to build a skyscraper when you have the blueprints but no money for bricks, no crane to lift them, and no team to assemble them. The blueprint (the FAIR principles) is ready, but the funding, infrastructure, and people needed to support it are lagging behind.

The Takeaway: To stop future pandemics, we don't just need better computers; we need better funding, better ways to connect different databases, and a massive training program to turn more scientists into data experts. The PDN is now using this survey to build that support system, ensuring that when the next germ threatens us, the library is open, organized, and ready for everyone to use.
