Methods for Reproducible Comparison of Strategies in Stochastic Modelling

This paper demonstrates how hash-based matching and pseudo-random number generation methods, specifically the Bernoulli hashing approach, enable efficient and reproducible comparisons of stochastic simulation strategies across varying model complexities while effectively addressing counterfactual scenarios.

Original authors: Sunnucks, R., Davis, E. L., Rock, K. S.

Published 2026-05-01

Original paper licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are a policymaker trying to decide between two different strategies to stop a disease, like Strategy A (a new vaccine) and Strategy B (doing nothing). You have a computer model that simulates how the disease spreads. Because real life is messy and unpredictable, your model uses "stochastic" (random) simulations. It's like rolling dice to decide who gets sick next.

The problem is that when you run the model for Strategy A and then run it again for Strategy B, the "dice rolls" are totally different every time. It's like comparing two different weather forecasts where one predicts rain because the computer rolled a 3, and the other predicts sun because it rolled a 6. You can't tell if the difference in results is because the strategy is actually better, or just because the random dice rolls happened to be unlucky for one of them. This "noise" makes it hard to know which strategy is truly the winner.

This paper introduces a clever way to fix that noise so you can compare strategies fairly.

The Core Idea: The "Parallel Universe" Trick

The authors propose a method called Hash-Based Matching. Think of it like this:

Imagine you are testing two different cars (Strategy A and Strategy B) on a race track.

The Old Way (Regular Stochastic): You drive Car A on a sunny day with a tailwind, and Car B on a rainy day with a headwind. If Car A wins, you don't know if it's because the car is better or because the weather was nicer.
The New Way (Hash-Based): You drive both cars on the exact same day, on the exact same track, with the exact same wind. The only thing that changes is the car itself.

In the computer model, the "weather" is the random number generation. The authors use a mathematical tool called a Hash Function to act as a "time machine" or a "shared reality."

Here is how it works in simple terms:

The Salt: They give every simulation run a unique "salt" (like a secret ID number).
The Hash: Before the computer rolls the dice for any event (like a person getting infected), it looks at the current time, the event type, and the secret ID. It runs these through a "hash machine" to create a specific seed.
The Result: Because the inputs are the same for both strategies at the same moment in time, the "dice rolls" come out the same. If 5 people get infected in Strategy A, the model ensures that the underlying randomness would have caused 5 people to get infected in Strategy B if the conditions were the same.

This allows the model to see the true difference between the strategies, stripping away the confusion caused by random luck.
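The salt, hash, and result steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the choice of SHA-256, and the `salt|time|event` key layout are all assumptions made for the example.

```python
import hashlib
import random


def hashed_seed(salt: int, time_step: int, event: str) -> int:
    """Turn (salt, time step, event type) into a reproducible integer seed.

    The key layout and SHA-256 are illustrative choices, not from the paper.
    """
    key = f"{salt}|{time_step}|{event}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def dice_roll(salt: int, time_step: int, event: str) -> float:
    """Reseed a generator from the hash, then draw one number in [0, 1)."""
    rng = random.Random(hashed_seed(salt, time_step, event))
    return rng.random()


# Same run (salt), same moment, same event type: the roll is identical,
# no matter which strategy is being simulated.
roll_a = dice_roll(salt=7, time_step=12, event="infection")
roll_b = dice_roll(salt=7, time_step=12, event="infection")
print(roll_a == roll_b)  # True
```

Changing the salt gives a different "universe" with its own rolls, which is how many paired replicates can be produced.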

The Three Methods Proposed

The paper suggests three specific ways to do this, depending on how complex your model is:

1. The Default Hashing Method (The "Proportional" Approach)

How it works: It uses the standard random number generator but resets the seed using the hash function before every event.
The Analogy: Imagine two buckets of water. If you pour water into Bucket A, the hash method ensures that if Bucket B has twice as much water, it gets exactly twice as much "random splash."
Pros/Cons: It's fast and easy to use. However, it has a slight quirk: it assumes the randomness scales perfectly with the number of people. It's like saying if you have 100 people, the "bad luck" is exactly 100 times worse than if you have 1 person. This is usually fine, but not perfectly realistic for every single individual.

2. The Bernoulli Hashing Method (The "Individual" Approach)

How it works: Instead of rolling one big die for the whole group, it flips a coin for every single person in the model to see if they get infected.
The Analogy: Instead of guessing how many people in a crowd will catch a cold, you walk up to every single person and ask, "Did you catch it?" using the same coin flip logic for both strategies.
Pros/Cons: This is the most accurate because it treats every person as an individual. However, it is very slow. If you have a city of 1 million people, the computer has to flip a coin 1 million times for every single step of the simulation. It's like trying to count every grain of sand on a beach one by one.

3. The Truncated Bernoulli Method (The "Smart Shortcut")

How it works: This is a compromise. It knows that in most cases, only a few people will get sick at once. So, instead of flipping a coin for everyone, it only flips coins for the "likely" few, and skips the rest.
The Analogy: Imagine a lottery with 1 million tickets, but you know only 5 people will win. Instead of checking all 1 million tickets, you use a smart trick to only check the 5 tickets that have a chance of winning.
Pros/Cons: It's much faster than the full Bernoulli method but still very accurate for diseases that spread slowly. It's the "Goldilocks" solution for complex models.

What They Found (The Results)

The authors tested these methods on two models:

A Simple Model (SEIRV): A basic model of a vaccine-preventable disease.
- Result: The new hashing methods were much clearer. The "noise" disappeared. They could clearly see that the vaccine worked, whereas the old methods sometimes made it look like the vaccine was useless or even harmful just because of random bad luck in the simulation.
A Complex Model (gHAT): A detailed model of African Sleeping Sickness, which involves flies, humans, and different interventions.
- Result: The "Truncated Bernoulli" method was the winner here. It allowed them to compare strategies (like active screening vs. vector control) without the random noise confusing the results. They could confidently say, "Strategy X is better," without worrying that the computer just rolled the dice poorly.

Why This Matters

The paper argues that without these methods, policymakers might make bad decisions.

The Risk: If the random noise makes a good strategy look bad, a policymaker might reject a life-saving vaccine.
The Benefit: By using these "parallel universe" hashing methods, the comparison becomes fair. You are comparing the strategy, not the luck.

Summary

The paper doesn't claim to cure diseases or invent new vaccines. It simply provides a better ruler for measuring how well different strategies work in computer models. It ensures that when scientists say "Strategy A is better than Strategy B," they actually mean it, and not just that they got lucky with the dice rolls.

Simple models: Use the Bernoulli method for maximum accuracy.
Complex models: Use the Truncated Bernoulli method for a balance of speed and accuracy.
General use: The Default Hashing method is a solid, fast option for most situations.

The authors emphasize that these methods are specifically for tau-leaping simulations (a common way to run disease models) and are designed to make the "counterfactual" (what would have happened if we did something else) much clearer and less noisy.

1. Problem Statement

Stochastic simulations are essential for modeling real-world phenomena like infectious disease dynamics because they capture uncertainty and produce discrete integer outputs (crucial for modeling extinction events). However, a significant challenge arises when comparing different intervention strategies (e.g., Strategy A vs. Strategy B) using these models.

The Core Issue: In standard stochastic simulations, the "noise" introduced by random number generation (RNG) is independent between different strategy runs. When comparing two strategies, this independence creates statistical noise that obscures the true difference between them.
The Consequence: Policymakers may incorrectly conclude that a superior strategy is inferior (or vice versa) due to random variance rather than actual model dynamics. This is particularly problematic when calculating metrics like the probability that one strategy is better than another, or when evaluating counterfactual scenarios (e.g., "What would have happened if we intervened earlier?").
Limitations of Existing Solutions:
- Seeded RNG: Setting the same initial seed for different strategies fails because the simulation paths diverge immediately, breaking the dependency between the "same reality" scenarios.
- Perfect Counterfactuals (e.g., Kaminsky et al.): These methods track every individual to ensure perfect alignment but are computationally prohibitive (requiring massive RAM and time) and often incompatible with standard compartmental models.
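To see why a shared initial seed is not enough, consider the toy sketch below. It assumes one simple mechanism for divergence (an intervention that consumes a single extra random draw); real models may diverge for other reasons, and the function name and step numbers are hypothetical.

```python
import random


def run(extra_draw_at_step_2: bool, seed: int = 42) -> list[float]:
    """Return the random draw used at each of five steps."""
    rng = random.Random(seed)
    draws = []
    for step in range(5):
        if extra_draw_at_step_2 and step == 2:
            rng.random()  # hypothetical intervention event uses up one draw
        draws.append(rng.random())
    return draws


baseline = run(extra_draw_at_step_2=False)
intervention = run(extra_draw_at_step_2=True)

# Steps 0 and 1 agree; from step 2 onward the two runs read different
# positions of the same random stream, so their "luck" no longer matches.
matches = [a == b for a, b in zip(baseline, intervention)]
print(matches)
```

After the extra draw, the intervention run is simply one position ahead in the stream, so events that should be comparable receive unrelated random numbers.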

2. Methodology

The authors propose a suite of hash-based pseudo-random number generation (PRNG) methods. These methods ensure that when two simulations (strategies) encounter the same "event" (defined by time, state, and event type), they generate the same random outcome, thereby creating a statistical dependence (coupling) between the realizations.
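Why coupling helps follows from the standard identity for the variance of a difference. Writing $Y_A$ and $Y_B$ for the same outcome under the two strategies (notation introduced here for illustration, not taken from the paper):

$$
\operatorname{Var}(Y_A - Y_B) = \operatorname{Var}(Y_A) + \operatorname{Var}(Y_B) - 2\,\operatorname{Cov}(Y_A, Y_B)
$$

With independent random numbers the covariance term is zero, so the noise of both runs adds up. When the two runs share their randomness, the covariance is typically positive and the variance of the estimated difference shrinks.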

The paper builds upon the hashprng package (Pearson & Abbott) and introduces three specific approaches:

A. Default Hashing Method

Mechanism: Before drawing a random number for an event (typically from a Poisson distribution in tau-leaping algorithms), the random seed is set to the output of a hash function.
Inputs: The hash function takes the time step, a unique "salt" (identifying the specific simulation trajectory), and the event type.
Property: This ensures that if two strategies have the same number of individuals and rates at a specific time, they draw from the same percentile of the distribution.
Limitation: It exhibits "proportionality." If Strategy B has $\Delta N$ more individuals than Strategy A, the number of events in B scales roughly in proportion to the extra individuals, rather than the extra events being an independent realization of the extra risk.
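A minimal sketch of the default method follows. To make the "same percentile" property explicit, it assumes the Poisson draw is done by inverse-CDF sampling from a hashed uniform; the paper's approach reseeds a standard generator instead, so this is an illustrative stand-in. All function names and the rate values are hypothetical.

```python
import hashlib
import math


def hashed_uniform(salt: int, time_step: int, event: str) -> float:
    """Reproducible number in [0, 1) from (salt, time step, event type)."""
    key = f"{salt}|{time_step}|{event}".encode("utf-8")
    top53 = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") >> 11
    return top53 / 2**53


def poisson_quantile(u: float, mean: float) -> int:
    """Smallest k with P(X <= k) >= u for X ~ Poisson(mean).

    Simple summation; adequate for the small means used here only.
    """
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < u and pmf > 0.0:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def hashed_poisson(mean: float, salt: int, time_step: int, event: str) -> int:
    """Poisson draw whose percentile is fixed by the hash inputs."""
    return poisson_quantile(hashed_uniform(salt, time_step, event), mean)


# Same salt, time step, and event type, but Strategy B has twice the
# population at risk (hypothetical per-person rate of 0.01 per step).
events_a = hashed_poisson(0.01 * 500, salt=3, time_step=10, event="infection")
events_b = hashed_poisson(0.01 * 1000, salt=3, time_step=10, event="infection")
print(events_a, events_b)
```

Because both draws sit at the same percentile, the strategy with the larger mean never gets fewer events, but the extra events are tied to the scaling of the distribution rather than to individual people.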

B. Bernoulli Hashing Method

Mechanism: Replaces the Poisson draw with a sum of Bernoulli trials. For $N$ individuals, the algorithm draws $N$ Bernoulli random variables (0 or 1) to determine if each individual undergoes the event.
Dependency: The underlying uniform random numbers for the Bernoulli draws are generated via the same hash function.
Advantage: This removes the "proportionality" issue. If Strategy A has $k$ infections, Strategy B (with $\Delta N$ more susceptibles) will have between $k$ and $k + \Delta N$ infections, ensuring consistent resolution of events: adding people never produces fewer events.
Drawback: Computationally expensive for large populations as it requires drawing a random number for every individual in every time step.
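The per-individual idea can be sketched as below. The example assumes each individual gets their own hashed uniform (keyed by a hypothetical individual index) and that both strategies use the same per-person event probability at this time step; neither detail is confirmed by the paper.

```python
import hashlib


def hashed_uniform(salt: int, time_step: int, event: str, individual: int) -> float:
    """Reproducible number in [0, 1) for one individual at one moment."""
    key = f"{salt}|{time_step}|{event}|{individual}".encode("utf-8")
    top53 = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") >> 11
    return top53 / 2**53


def bernoulli_events(n: int, p: float, salt: int, time_step: int, event: str) -> int:
    """Count events among n individuals, one hashed coin flip each."""
    return sum(hashed_uniform(salt, time_step, event, i) < p for i in range(n))


# Strategy B has 30 more susceptibles than Strategy A (illustrative numbers).
k_a = bernoulli_events(n=200, p=0.05, salt=3, time_step=10, event="infection")
k_b = bernoulli_events(n=230, p=0.05, salt=3, time_step=10, event="infection")
print(k_a, k_b)
```

The first 200 individuals receive identical coin flips in both runs, so the only possible difference comes from the 30 extra people. The cost is visible too: the loop does one hash per individual per time step.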

C. Truncated Bernoulli Hashing Method

Mechanism: A computational optimization of the Bernoulli method designed for large populations with low event rates. Instead of drawing $N$ Bernoulli variables, it draws a limited number ($m$) of variables from the tail of the distribution using order statistics (Beta distribution).
Logic: Since the expected number of events is usually much smaller than the population size ($E \ll N$), the algorithm only simulates the "active" portion of the distribution.
Trade-off: It is significantly faster than full Bernoulli hashing but introduces a very low probability of "inconsistent resolution" (where adding a person could theoretically cause more than $m$ events). This probability approaches zero as the time step decreases.
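One possible way to realize the order-statistics idea is sketched below; the paper's exact construction may differ. It relies on a standard fact: the smallest of $n$ independent uniforms follows a Beta(1, $n$) distribution, whose inverse CDF is $1 - (1 - v)^{1/n}$. The sketch generates the smallest uniforms one at a time and stops after at most $m$ of them. The names and parameter values are hypothetical, and the example shows only the truncation and the cost saving, not how the paper couples runs with different population sizes.

```python
import hashlib


def hashed_uniform(salt: int, time_step: int, event: str, index: int) -> float:
    """Reproducible number in [0, 1) for the index-th order statistic."""
    key = f"{salt}|{time_step}|{event}|{index}".encode("utf-8")
    top53 = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") >> 11
    return top53 / 2**53


def truncated_bernoulli_events(
    n: int, p: float, m: int, salt: int, time_step: int, event: str
) -> int:
    """Count how many of n uniforms fall below p, examining at most m of them."""
    count = 0
    lower = 0.0  # latest order statistic; all remaining uniforms lie above it
    for j in range(min(m, n)):
        remaining = n - j
        v = hashed_uniform(salt, time_step, event, j)
        # Minimum of `remaining` uniforms on (lower, 1): Beta(1, remaining)
        # inverse CDF, rescaled to the interval above `lower`.
        lower += (1.0 - lower) * (1.0 - (1.0 - v) ** (1.0 / remaining))
        if lower >= p:
            break  # every later order statistic is even larger
        count += 1
    return count  # capped at m: this cap is the truncation


events = truncated_bernoulli_events(
    n=1_000_000, p=2e-6, m=20, salt=3, time_step=10, event="infection"
)
print(events)
```

Here at most 20 hashed draws are needed instead of one million, at the price of capping the count at $m$ in the rare case that more events would have occurred.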

3. Key Contributions

Novel Algorithms: Introduction of the Bernoulli Hashing and Truncated Bernoulli Hashing methods, extending the existing hashprng framework to address proportionality and computational scalability.
Theoretical Framework: Formal definition of "consistent resolution of events" and the mathematical properties required for counterfactual comparisons in stochastic models.
Comparative Analysis: A rigorous comparison of these new methods against standard stochastic, seeded stochastic, and "perfect counterfactual" approaches.
Practical Implementation: Demonstration of how to integrate these methods into complex epidemiological models (SEIRV and gHAT) without requiring individual-based modeling (IBM).

4. Results

The authors tested their methods on two epidemiological models:

Case Study 1: SEIRV (Simple Vaccine-Preventable Infection)

Setup: Compared vaccination strategies against no intervention.
Findings:
- Variance Reduction: Both hashing methods drastically reduced the variance in "infections averted" compared to standard and seeded stochastic methods.
- Bernoulli Superiority: The Bernoulli method provided the lowest variance (best statistical coupling) while maintaining reasonable runtimes for this simple model.
- Realism: Standard and seeded methods occasionally produced "negative infections averted" (implying vaccination caused more infections), a logical impossibility. The hashing methods eliminated these artifacts.
- Performance: The hashing methods were 2–4x slower than the standard stochastic method, but the extra runtime was deemed a necessary cost for the gain in accuracy.
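The variance and "negative infections averted" findings can be reproduced qualitatively with a toy experiment. The sketch below is not the SEIRV model: it uses a single Poisson draw per strategy with made-up means, and a shared uniform stands in for a hashed draw.

```python
import math
import random
import statistics


def poisson_quantile(u: float, mean: float) -> int:
    """Smallest k with P(X <= k) >= u for X ~ Poisson(mean) (small means only)."""
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < u and pmf > 0.0:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


rng = random.Random(2026)  # fixed seed so the demo is repeatable
MEAN_NO_VACCINE = 20.0  # hypothetical expected infections without vaccination
MEAN_VACCINE = 12.0  # hypothetical expected infections with vaccination

independent, coupled = [], []
for _ in range(2000):
    u1, u2 = rng.random(), rng.random()
    no_vaccine = poisson_quantile(u1, MEAN_NO_VACCINE)
    # Unmatched noise: the vaccine run gets its own, unrelated uniform.
    independent.append(no_vaccine - poisson_quantile(u2, MEAN_VACCINE))
    # Matched noise: the vaccine run reuses the same uniform (same percentile).
    coupled.append(no_vaccine - poisson_quantile(u1, MEAN_VACCINE))

var_independent = statistics.pvariance(independent)
var_coupled = statistics.pvariance(coupled)
negative_independent = sum(d < 0 for d in independent)
negative_coupled = sum(d < 0 for d in coupled)
print(var_independent, var_coupled, negative_independent, negative_coupled)
```

In this toy setting the matched runs never show the vaccine causing extra infections, while the unmatched runs do so in a noticeable fraction of replicates.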

Case Study 2: gHAT (Complex African Sleeping Sickness Model)

Setup: A complex vector-borne disease model involving active screening and vector control.
Findings:
- Scalability: The full Bernoulli method was too slow (over 100x slower). The Truncated Bernoulli method was successfully implemented, offering a balance between speed and accuracy.
- Decision Making: In cost-effectiveness analyses (Net Monetary Benefit), the hashing methods produced clearer separation between strategies. Standard methods showed high noise, making it difficult to determine the optimal strategy at different willingness-to-pay thresholds.
- Last Transmission Event (LTE): Hashing methods provided more accurate and less noisy predictions for the year of the last transmission event, a critical metric for elimination goals.
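For readers unfamiliar with Net Monetary Benefit, the sketch below uses the standard definition (willingness to pay times health gain, minus cost) and estimates the probability that one strategy beats another from paired replicates. All numbers and strategy names are invented for illustration and are not results from the paper.

```python
def net_monetary_benefit(health_gain: float, cost: float, wtp: float) -> float:
    """Standard NMB: willingness-to-pay threshold times health gain, minus cost."""
    return wtp * health_gain - cost


# Hypothetical paired replicates: entry i of each list comes from the same
# matched "universe" (same salt), as (health gain, cost).
strategy_x = [(10, 5000), (12, 5200), (9, 4900), (11, 5100)]
strategy_y = [(6, 2000), (7, 2100), (5, 1900), (6, 2050)]


def prob_x_beats_y(wtp: float) -> float:
    """Share of paired replicates in which X has the higher NMB."""
    wins = sum(
        net_monetary_benefit(gx, cx, wtp) > net_monetary_benefit(gy, cy, wtp)
        for (gx, cx), (gy, cy) in zip(strategy_x, strategy_y)
    )
    return wins / len(strategy_x)


print(prob_x_beats_y(500))   # 0.0
print(prob_x_beats_y(1000))  # 1.0
```

Pairing replicate i of one strategy with replicate i of the other is only meaningful when the two runs share their randomness, which is what the hashing methods aim to provide.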

5. Significance and Implications

Policy Impact: The methods allow policymakers to make risk-averse decisions with higher confidence. By reducing the "noise" between strategies, the probability that one strategy is truly better than another can be estimated more accurately, preventing the rejection of beneficial interventions due to simulation artifacts.
Computational Efficiency: The proposed methods offer a "sweet spot" between the infeasible "perfect counterfactuals" (individual-based) and the noisy "standard stochastic" approaches. They are applicable to standard compartmental models without requiring a complete model rewrite.
Generalizability: While tested on epidemiology, the approach is applicable to any stochastic simulation where comparing counterfactual scenarios is required (e.g., ecology, economics).
Limitations: The methods are specific to tau-leaping algorithms. The Bernoulli approach remains computationally heavy for high-rate, large-population models, necessitating the use of the Truncated version, which carries a small theoretical risk of inconsistency.

Conclusion: The paper establishes that hash-based matching is a robust, computationally feasible, and statistically superior method for comparing stochastic strategies, significantly improving the reliability of evidence used in public health policy.