Uncovering identifiability of epidemiological models: basic reproduction number and complementary data streams

This study shows that although individual parameters in epidemiological models may not be uniquely identifiable, the basic reproduction number often is. It also shows that adding minimal complementary data can render otherwise non-identifiable models globally identifiable. Together, these results shift the focus of public health surveillance toward ensuring the identifiability of decision-relevant quantities rather than complete model identifiability.

Original authors: Pant, B., Saucedo, O., Pogudin, G.

Published 2026-01-25

Original paper licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are a detective trying to solve a mystery: How fast is a disease spreading, and how dangerous is it?

To solve this, you build a mathematical "simulation" of the disease. You feed it data you can see, like the number of people reporting to the hospital each day. But here's the catch: the data you see is just the tip of the iceberg. You can't see everyone who is sick, how many are immune, or exactly how many people are walking around carrying the virus.

This paper asks a fundamental question: Even if we had perfect, crystal-clear data (no errors, no missing numbers), could our mathematical model actually figure out the true rules of the disease?

In the world of math, this is called structural identifiability. If a model isn't "identifiable," it's like a lock with two different keys that open it perfectly. You can't tell which key is the "real" one, so you can't know the true rules of the disease.

Here is what the authors discovered, explained through simple analogies:

1. The "Blind Spot" Problem

Usually, scientists assume that if they can't figure out every single number in their model (like the exact speed of transmission or the exact population size), the whole model is useless.

The authors say: Not so fast.

Think of a recipe for a cake. If you only taste the frosting, you might not know exactly how much sugar or flour went into the batter. You can't identify the individual ingredients. However, you can still know exactly how sweet the cake is.

The paper shows that even when a model is "broken" and can't tell you the exact values of every single parameter, it can often still tell you the Basic Reproduction Number (R_0).

  • What is R_0? It's the average number of people one sick person infects. It's the most important number for deciding if an outbreak will explode or die out.
  • The Finding: In almost every type of disease model they tested (from simple flu models to complex mosquito-borne diseases), the model could correctly identify R_0 even if it couldn't figure out the individual ingredients (like the exact transmission rate or population size).
  • The Takeaway: You don't need to solve the entire puzzle to know if the fire is going to spread. You just need to know the "spread factor."
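
To make the cake analogy concrete, here is a minimal Python sketch. It is not one of the paper's models or its method; it assumes a toy SIR model (S' = -beta*S*I, I' = beta*S*I - gamma*I) in which we only observe reported cases y = k*I, where k is an unknown reporting fraction. The parameter values, the helper names simulate_reported_cases and basic_reproduction_number, and the two "worlds" are all made up for illustration. World B has double the transmission rate, double the reporting fraction, and half the population of world A, yet both should produce the same data and the same R_0.

```python
def simulate_reported_cases(beta, gamma, k, s0, i0, days=60, dt=0.1):
    """Integrate the toy SIR model with RK4 and return y = k*I once per day."""
    def deriv(s, i):
        new_infections = beta * s * i
        return -new_infections, new_infections - gamma * i

    s, i = s0, i0
    steps_per_day = round(1 / dt)
    reported = [k * i]
    for _day in range(days):
        for _step in range(steps_per_day):
            k1s, k1i = deriv(s, i)
            k2s, k2i = deriv(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
            k3s, k3i = deriv(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
            k4s, k4i = deriv(s + dt * k3s, i + dt * k3i)
            s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
            i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6
        reported.append(k * i)
    return reported


def basic_reproduction_number(beta, gamma, s0, i0):
    """R_0 = beta * N / gamma for this toy model, with N = S0 + I0."""
    return beta * (s0 + i0) / gamma


# Two hypothetical "worlds" with different individual parameters.
world_a = dict(beta=0.0005, gamma=0.2, k=0.5, s0=999.0, i0=1.0)
world_b = dict(beta=0.0010, gamma=0.2, k=1.0, s0=499.5, i0=0.5)

y_a = simulate_reported_cases(**world_a)
y_b = simulate_reported_cases(**world_b)

# Largest day-by-day difference between the two observed data streams.
max_gap = max(abs(a - b) for a, b in zip(y_a, y_b))

r0_a = basic_reproduction_number(world_a["beta"], world_a["gamma"],
                                 world_a["s0"], world_a["i0"])
r0_b = basic_reproduction_number(world_b["beta"], world_b["gamma"],
                                 world_b["s0"], world_b["i0"])

print("largest gap between the two data streams:", max_gap)
print("R_0 in world A:", r0_a, "| R_0 in world B:", r0_b)
```

In this sketch the two data streams should be indistinguishable even though beta, k, and the population size all differ, while R_0 comes out the same for both worlds. That is the sense in which the individual ingredients can stay hidden while the sweetness is still known.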

2. The "One Clue" Miracle

What happens if the model is stuck and can't pin down its individual parameters at all? The authors found a surprisingly simple fix.

Imagine you are trying to guess the height of a tree, but you only have a blurry photo of its shadow. You can't tell the height. But if someone hands you one single, perfect measurement of the tree's trunk at a specific moment, you can instantly calculate the exact height.

The paper shows that for many complex models, adding a single data point from a different source can unlock the whole mystery.

  • Example: If you only track daily deaths, the model might be confused about whether the disease is highly contagious but mild, or not very contagious but deadly.
  • The Fix: If you add a single measurement of how many people are currently sick (or recovered) at a specific time, the model becomes "globally identifiable": every parameter can, in principle, be recovered uniquely.
  • The Takeaway: Public health officials may not need to spend millions tracking five different data streams constantly. They might get better results by spending resources to get one high-quality measurement from a different angle (like a blood test survey) at a key moment.
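
Here is a small sketch of why one extra clue can be enough. It uses a toy setup chosen for illustration, not the paper's analysis: an SIR model (S' = -beta*S*I, I' = beta*S*I - gamma*I) where the only routine data are reported cases y = k*I. In that toy model the routine data fix only the combinations beta/k and beta*S0, not beta, k, or S0 separately. The function name recover_parameters and all the numbers below are hypothetical.

```python
def recover_parameters(beta_over_k, beta_times_s0, reported_at_t, true_infected_at_t):
    """Separate beta, k and S0 using one measurement of true prevalence.

    beta_over_k and beta_times_s0 are the combinations that reported cases
    alone can pin down in this toy model (an assumption of the sketch).
    """
    # The single extra measurement fixes the reporting fraction k = y / I.
    k = reported_at_t / true_infected_at_t
    # Once k is known, the identifiable combinations untangle one by one.
    beta = beta_over_k * k
    s0 = beta_times_s0 / beta
    return {"k": k, "beta": beta, "S0": s0}


# Hypothetical scenario: on one day 40 cases are reported, while a survey
# finds 80 people actually infected on that same day.
estimate = recover_parameters(
    beta_over_k=0.001,
    beta_times_s0=0.4995,
    reported_at_t=40.0,
    true_infected_at_t=80.0,
)
print(estimate)
```

The design point is that the extra measurement does not have to be repeated: in this toy case, one well-timed value of the true number infected is what breaks the tie between the otherwise indistinguishable parameter sets.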

3. The "Shape" Matters

The authors also found that the shape of the model matters.

  • Most Models: Whether it's a simple flu model, a cholera model with water transmission, or a mosquito model, the "spread factor" (R_0) is usually easy to find, even if the rest is hard.
  • The Exception: There was one tricky model where people change their behavior (like wearing masks when they see many sick people). In this specific case, the "spread factor" was hard to pin down perfectly. It's like a chameleon that changes color so fast you can't get a clear picture of its true color.

Summary of the Paper's Message

The paper challenges the old way of thinking. Instead of asking, "Is our whole model perfect and identifiable?" we should ask, "Can we identify the specific number that matters for our decision?"

  • Good News: Even if the model is "broken" regarding individual details, it often still works perfectly for the most important decision-making number (R_0).
  • Better Strategy: If the model is stuck, you don't need more data everywhere. You just need one tiny, perfect piece of extra information from a different source to fix the whole system.

In short: You don't need to see the whole forest to know if it's on fire; sometimes, just seeing one spark is enough to tell you everything you need to know.
