Exploring the Model Dependence of MCMC-Based 21 cm… — Plain-Language Explanation

The Big Picture: Trying to Guess the Recipe

Imagine you are trying to figure out the secret recipe for a famous cake (the Epoch of Reionization, the era in the early universe when the first stars and galaxies ionized the hydrogen gas around them). You can't taste the cake directly because it's too far away and the signal is too faint. Instead, you have a "taste test" machine called 21CMMC.

This machine works by guessing ingredients (like how efficiently galaxies produced ionizing light, or how hot a clump of gas had to be before stars could form in it) and then baking a "mock cake" (a computer simulation) to see if it matches the crumbs you found on the table (the real data from radio telescopes).

The problem this paper investigates is: What happens if the machine uses a slightly different baking method than the one used to make the real crumbs?

The Two Baking Methods (Bubble-Finding Algorithms)

The computer program that makes the mock cakes, called 21cmFAST, has two different ways to decide when a "bubble" of ionized gas forms around a star. Think of these as two different ways to frost a cake:

Method 1 (The "Flood" Method): When the machine decides a spot is ready to be frosted, it paints the entire area around it at once. It's like dipping a whole cookie into frosting. This takes a long time to compute, and the frosting spreads across the cake faster.
Method 2 (The "Dot" Method): When the machine decides a spot is ready, it only puts a dot of frosting on the exact center pixel. It's like using a tiny brush. This is faster and is the default setting.

Both methods are supposed to create a cake that looks like the real universe, but they create slightly different textures.

The Experiment: Mixing and Matching

The authors ran a massive experiment with 100 pairs of "cakes."

The "Real" Data: They baked 100 mock universes using Method 1 (the slow, detailed one).
The "Test" Data: They baked 100 mock universes using Method 2 (the fast, default one).
The Test: As a baseline, they first let each machine guess the recipe for cakes baked with its own method. Then they swapped: they took the "Method 1" cakes and used the "Method 2" machine to guess the recipe, and took the "Method 2" cakes and used the "Method 1" machine.

They wanted to see if the machine could still correctly identify the ingredients (the Ionizing Efficiency, or how good stars are at creating bubbles, and the Minimum Virial Temperature, or how hot a clump of gas must be before stars can form in it) even if the baking method didn't match.

The Results: The Machine Gets Confused

The results showed that the machine is very sensitive to the baking method.

When the methods matched (In-Domain): If the machine used the same method that baked the cake (Method 1 for a Method 1 cake, or Method 2 for a Method 2 cake), it got the ingredients close to right.
When the methods didn't match (Out-of-Domain):
- The "Flood" Cake, "Dot" Machine: When the machine tried to guess the recipe for a "Flood" cake using "Dot" logic, it got the Ionizing Efficiency completely wrong. It thought the stars were way more efficient at making bubbles than they actually were. It was like trying to guess a recipe for a sponge cake using a machine designed for a dense fruitcake; it just couldn't figure out the texture.
- The "Dot" Cake, "Flood" Machine: Conversely, when trying to guess a "Dot" cake with "Flood" logic, it underestimated the efficiency.

The Analogy: Imagine you are trying to guess how much sugar is in a smoothie.

If you taste a smoothie made with a blender (Method 1) but your taste buds are calibrated for a milkshake (Method 2), you might think the smoothie has way too much sugar just because the texture is different, even if the sugar amount is the same. The machine confuses the texture of the data with the ingredients.

Does Noise Make it Better?

The authors wondered, "What if we add static or noise to the data, like real radio telescopes have?" Maybe the noise would hide the small differences between the two methods, making the machine's confusion less obvious.

The Answer: No. Adding noise actually made the problem worse. Because the noise is strongest at early times (high redshifts), it forced the machine to rely on later times (lower redshifts), which is exactly where the two methods disagree the most. It's like trying to hear a whisper in a noisy room; if the noise covers up the clear parts of the message, you are left guessing based on the parts that sound different anyway.

The "Model-Independent" Test

The authors also asked: "Even if we can't get the specific ingredients (like how efficient the stars are) right, can we at least get the big picture right? Like, how long did the cake take to bake, and when was it half-done?"

They looked at the duration (how long reionization took) and the midpoint (when it was half-done).

Midpoint: The machine was okay at guessing when the cake was half-done.
Duration: The machine failed here. It guessed the baking time was much shorter than it actually was.

The Conclusion

The paper concludes that the tool we use to study the early universe (21CMMC) only works well when the simulation it uses to make its guesses (21cmFAST, with one particular bubble-finding method) matches the way the data were actually made.

If the real universe behaves slightly differently than the specific "baking method" the computer uses, the tool will give us the wrong answers about the ingredients of the universe. It's not just a small error; it's a fundamental confusion between the shape of the data and the physics behind it.

The Takeaway: Before we can trust these tools to tell us the true history of the universe, we need to make sure the tools aren't just memorizing the quirks of their own computer code. We need to find ways to analyze the data that don't depend so heavily on one specific simulation method.

Technical Summary: Exploring the Model Dependence of MCMC-Based 21 cm Power Spectrum Parameter Constraints

Problem Statement
The detection and analysis of the cosmic 21 cm signal from the Epoch of Reionization (EoR) rely heavily on the seminumerical simulation code 21cmFAST and its Markov Chain Monte Carlo (MCMC) sampler, 21CMMC. While 21CMMC has been used to constrain astrophysical parameters using upper limits from observatories like LOFAR, MWA, and HERA, the analysis assumes that the underlying seminumerical model accurately captures the true astrophysics of the EoR. However, 21cmFAST contains internal modeling choices, specifically two distinct "bubble-finding" algorithms for identifying ionized regions, which produce different topological structures and reionization histories despite using identical input parameters. The central problem addressed is whether 21CMMC can robustly recover astrophysical parameters when the model used for sampling differs from the model used to generate the data (an "out-of-domain" scenario), a situation that mirrors the reality of analyzing real observational data where the true physics is unknown.

Methodology
The authors generated a dataset of 100 pairs of 21 cm light-cones using 21cmFAST (version 3.3.2). Each pair shared identical cosmological parameters, density fields, and astrophysical inputs but differed solely in the bubble-finding algorithm used:

Algorithm 1: Flags all pixels within a region satisfying ionization criteria, allowing overlapping ionized bubbles. This is computationally expensive ($O(N^2)$) and produces faster reionization.
Algorithm 2: Flags only the central pixel of a region, preventing overlap. This is the default, computationally efficient algorithm ($O(N)$).
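The difference between the two flagging rules can be sketched with a toy one-dimensional example. This is a minimal illustration under stated assumptions, not 21cmFAST's implementation: the function name `toy_bubble_finder`, the top-hat smoothing, the periodic boundary, and the ionization criterion `zeta * smoothed >= 1` are all hypothetical simplifications chosen for clarity.

```python
import numpy as np

def toy_bubble_finder(source_field, zeta, radii, flag_whole_region):
    """Toy 1D bubble finder (illustrative only, not 21cmFAST code).

    source_field      : 1D array standing in for the local collapsed fraction
    zeta              : toy ionizing efficiency
    radii             : smoothing radii in pixels (0 means no smoothing)
    flag_whole_region : True  -> flag every pixel within the radius (Algorithm 1 style)
                        False -> flag only the central pixel (Algorithm 2 style)
    """
    n = source_field.size
    ionized = np.zeros(n, dtype=bool)
    for r in sorted(radii, reverse=True):          # largest scale first
        if r > 0:
            # Periodic padding so every pixel sees a full window of 2r + 1 pixels.
            padded = np.concatenate([source_field[-r:], source_field, source_field[:r]])
        else:
            padded = source_field
        kernel = np.ones(2 * r + 1) / (2 * r + 1)  # top-hat average
        smoothed = np.convolve(padded, kernel, mode="valid")
        # Assumed toy criterion: enough ionizing photons within the window.
        for i in np.flatnonzero(zeta * smoothed >= 1.0):
            if flag_whole_region:
                ionized[np.arange(i - r, i + r + 1) % n] = True
            else:
                ionized[i] = True
    return ionized

rng = np.random.default_rng(0)
field = rng.random(64) * 0.1                        # made-up source field
radii = [0, 1, 2, 4]
flood = toy_bubble_finder(field, 15.0, radii, flag_whole_region=True)
dot = toy_bubble_finder(field, 15.0, radii, flag_whole_region=False)
print("ionized fraction, flag whole region:", flood.mean())
print("ionized fraction, flag central pixel:", dot.mean())
```

In this toy, every pixel flagged by the central-pixel rule is also flagged by the whole-region rule, so the whole-region rule never yields a smaller ionized fraction for the same inputs. That mirrors, qualitatively, the faster reionization attributed to Algorithm 1.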

The study varied two key astrophysical parameters across the dataset: ionizing efficiency ($\zeta$) and minimum virial temperature ($T_{vir}^{min}$).

The authors then performed 21CMMC analyses in four configurations:

In-domain ($1 \to 1$ and $2 \to 2$): Sampling Algorithm 1 data with an Algorithm 1 sampler, and Algorithm 2 data with an Algorithm 2 sampler.
Out-of-domain ($1 \to 2$ and $2 \to 1$): Sampling Algorithm 1 data with an Algorithm 2 sampler, and vice versa.

The analysis focused on the recovery of $\zeta$ and $T_{vir}^{min}$, as well as derived model-independent parameters: the duration ($\Delta z$) and midpoint ($z_{50}$) of reionization. Follow-up tests included introducing instrumental noise (simulated via 21cmSense for HERA) and varying parameter ranges to investigate degeneracies.
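As a rough illustration of how the two derived quantities can be read off a reionization history, the sketch below interpolates a neutral-fraction curve. The summary does not state how the paper defines $\Delta z$, so the choice here (the redshift interval between neutral fractions of 0.75 and 0.25) is an assumption, as are the function names and the tanh-shaped toy history.

```python
import numpy as np

def redshift_at_neutral_fraction(z, x_hi, target):
    """Linearly interpolate the redshift at which the neutral fraction equals target."""
    z = np.asarray(z, dtype=float)
    x_hi = np.asarray(x_hi, dtype=float)
    order = np.argsort(x_hi)            # np.interp needs increasing x-coordinates
    return float(np.interp(target, x_hi[order], z[order]))

def midpoint_and_duration(z, x_hi, lo=0.25, hi=0.75):
    """Return (z_50, delta_z) for a monotonic reionization history.

    z_50    : redshift where the neutral fraction is 0.5
    delta_z : z(x_HI = hi) - z(x_HI = lo); the 0.25/0.75 thresholds are an
              assumed convention, not taken from the paper.
    """
    z50 = redshift_at_neutral_fraction(z, x_hi, 0.5)
    delta_z = (redshift_at_neutral_fraction(z, x_hi, hi)
               - redshift_at_neutral_fraction(z, x_hi, lo))
    return z50, delta_z

# Toy history: neutral at high redshift, ionized at low redshift.
z = np.linspace(5.0, 12.0, 701)
x_hi = 0.5 * (1.0 + np.tanh((z - 8.0) / 1.0))
z50, delta_z = midpoint_and_duration(z, x_hi)
print(f"z_50 = {z50:.3f}, delta_z = {delta_z:.3f}")
```

Because both quantities depend only on the shape of the history and not on $\zeta$ or $T_{vir}^{min}$ directly, they can be compared across models with different parameterizations, which is the sense in which the summary calls them model-independent.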

Key Results

Parameter Recovery ($\zeta$ and $T_{vir}^{min}$): 21CMMC performed well in in-domain runs, recovering parameters close to true values. However, in out-of-domain runs, performance degraded significantly.
- $\zeta$ Recovery: The sampler exhibited strong model dependence. When sampling Algorithm 1 data with Algorithm 2, $\zeta$ was categorically overestimated (often hitting the prior ceiling). Conversely, sampling Algorithm 2 data with Algorithm 1 led to systematic underestimation. The reduced chi-squared ($\chi^2_\nu$) values for out-of-domain runs were orders of magnitude higher than those for in-domain runs.
- $T_{vir}^{min}$ Recovery: The dependence was weaker but still present. While Algorithm 2 data sampled with Algorithm 1 returned values close to the identity line (recovered value equal to true value), Algorithm 1 data sampled with Algorithm 2 showed significant scatter and bias. The authors note this apparent accuracy for $T_{vir}^{min}$ is likely an artifact of the specific parameterization chosen rather than a robust feature.
Degeneracy: The study identifies that the topological difference between the two bubble-finding algorithms (specifically the size and overlap of ionized bubbles) is degenerate with the ionizing efficiency parameter $\zeta$. To match the faster reionization speed of Algorithm 1 data using the slower Algorithm 2 model, 21CMMC compensates by inflating $\zeta$.
Model-Independent Parameters: Despite the failure to recover the underlying 21cmFAST parameters accurately, the study tested the recovery of the reionization duration and midpoint.
- Midpoint ($z_{50}$): Recovered with reasonable accuracy in out-of-domain runs.
- Duration ($\Delta z$): Showed a clear bias; 21CMMC failed to accurately return the duration of reionization in out-of-domain scenarios, underestimating it significantly.
Impact of Noise: Introducing realistic instrumental noise did not mitigate the model dependence. In fact, the performance gap between in-domain and out-of-domain runs widened with noise, as the noise profile (dominated by high-redshift uncertainty) forced the fits to rely on lower-redshift data where the model differences are most pronounced.
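The $\zeta$ bias and the inflated $\chi^2_\nu$ described in the results above can be reproduced qualitatively with a toy fit. The sketch below is not the authors' pipeline: `toy_signal` is a made-up stand-in for a 21 cm statistic, the `speedup` and `width` values are invented to make "Algorithm 1" reionize faster than "Algorithm 2" at fixed $\zeta$, and a brute-force grid search over $\zeta$ replaces the MCMC sampler.

```python
import numpy as np

def toy_signal(z, zeta, algorithm):
    """Toy stand-in for a 21 cm statistic versus redshift (NOT a 21cmFAST output)."""
    speedup = 1.3 if algorithm == 1 else 1.0   # assumed: Algorithm 1 reionizes faster
    width = 0.8 if algorithm == 1 else 1.0     # assumed: and with a different shape
    z_mid = 5.0 + 3.0 * np.log10(speedup * zeta)
    x_hi = 0.5 * (1.0 + np.tanh((z - z_mid) / width))
    return 100.0 * x_hi * (1.0 - x_hi)         # peaks at the midpoint of reionization

def best_fit_zeta(data, z, sigma, sampler_algorithm, zeta_grid):
    """Grid search for the zeta minimizing chi-squared (a stand-in for MCMC)."""
    chi2 = np.array([
        np.sum(((data - toy_signal(z, zg, sampler_algorithm)) / sigma) ** 2)
        for zg in zeta_grid
    ])
    best = int(np.argmin(chi2))
    n_dof = z.size - 1                          # one fitted parameter
    return float(zeta_grid[best]), float(chi2[best] / n_dof)

z = np.linspace(5.0, 15.0, 41)
sigma = np.full_like(z, 0.5)                    # assumed uniform uncertainty
zeta_grid = np.linspace(10.0, 100.0, 901)       # toy prior range
true_zeta = 30.0

results = {}
for data_alg in (1, 2):
    data = toy_signal(z, true_zeta, data_alg)   # noise-free mock data
    for sampler_alg in (1, 2):
        results[(data_alg, sampler_alg)] = best_fit_zeta(
            data, z, sigma, sampler_alg, zeta_grid)

for (d, s), (zeta_fit, chi2_nu) in results.items():
    print(f"{d} -> {s}: best-fit zeta = {zeta_fit:.1f}, reduced chi2 = {chi2_nu:.3g}")
```

In this toy, the in-domain fits recover the input $\zeta$, the $1 \to 2$ fit lands above it, and the $2 \to 1$ fit lands below it, because the sampler can only shift the timing of reionization by changing $\zeta$. Replacing the uniform `sigma` with an array that grows with redshift is one way to experiment with the noise effect, since it would down-weight the high-redshift points in the sum.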

Significance and Claims
The paper concludes that 21CMMC's constraining power is highly sensitive to the agreement between the astrophysical model of the mock data and the model used for sampling. The authors claim that:

Model Dependence is Non-Trivial: The discrepancy is not merely a small statistical error but a systematic bias that prevents the accurate recovery of both specific model parameters ($\zeta$) and nominally model-independent observables (reionization duration).
Current Constraints May Be Biased: Previous constraints on EoR astrophysics derived using 21CMMC may be inaccurate if the underlying simulation assumptions (such as the specific bubble-finding algorithm or other implicit modeling choices) do not perfectly match the true physics of the Universe.
Need for Robustness: The results motivate a shift toward model-independent analysis techniques or a deeper investigation into the algorithmic dependencies of seminumerical simulations before they can be trusted to interpret upcoming 21 cm data from current and future radio interferometers. The authors emphasize that while 21cmFAST is a useful tool, its specific implementation details (like the bubble-finding algorithm) introduce hidden uncertainties that must be quantified.
