
Machine Learning and Explainable AI for Multi-State Classification of Malaria Transmission Dynamics in Kenya

This study develops and validates an interpretable machine learning framework using Extreme Gradient Boosting to accurately classify malaria transmission states across Kenya's 47 counties from 2015 to 2025, demonstrating that integrating epidemiological and environmental data can effectively support targeted surveillance and resource allocation.

Original authors: Gogo, J. A., Wanyonyi, M.

Published 2026-05-12


CC BY 4.0

Original paper licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine malaria transmission in Kenya not as a smooth, flowing river, but as a weather system that shifts between four distinct "seasons": Low, Moderate, High, and Very High danger.

This paper is like a team of meteorologists trying to build a super-accurate forecast machine. Instead of just guessing the temperature, they want to predict exactly which "season" of malaria risk a specific county will be in next month.

Here is the story of how they built this machine, explained simply:

1. The Goal: Sorting the Weather

The researchers wanted to move away from complex, confusing numbers and instead sort every month in every one of Kenya's 47 counties into one of those four clear buckets.

Bucket 0: Low risk (The calm season).
Bucket 1: Moderate risk (A bit of rain).
Bucket 2: High risk (A storm is brewing).
Bucket 3: Very High risk (A hurricane).

Why do this? Because health officials need clear instructions. Knowing it's a "Category 3 storm" tells them exactly what to do, whereas just knowing "it's going to rain a lot" is harder to act on.

2. The Ingredients: What the Machine Ate

To make these predictions, the team fed their computer a massive "smoothie" of data from 2015 to 2025. The main ingredients were:

The Past: What happened last month and the month before (malaria cases don't just appear out of nowhere; they have a memory).
The Environment: How much rain fell, how green the plants were (vegetation), and the temperature.
The Shield: How many people were using mosquito nets (Insecticide-Treated Nets).

3. The Contest: Four Different Forecasters

The researchers didn't just pick one way to guess; they held a competition between four different "forecasters" (machine learning models) to see who was best:

The Linear Thinker (Logistic Regression): Good at simple, straight-line logic, but struggled with the messy, complex reality of nature.
The Committee (Random Forest): A group of decision trees voting together. Very strong, but not quite the champion.
The Perfectionist (Extreme Gradient Boosting - XGBoost): This model learned by making mistakes and correcting them over and over again, step-by-step. It won the competition.
The Strict Rule-Follower (Support Vector Machine): Tried to draw rigid lines between categories but got confused by the complex data and performed poorly.

4. The Champion's Scorecard

The winner, Extreme Gradient Boosting, was incredibly accurate.

Accuracy: It got the right "season" just over 99% of the time (accuracy 0.9918) on the held-out 2021–2025 data.
Reliability: It didn't just guess; it gave a confidence score (probability) that was trustworthy. If it said there was a 90% chance of a "High Risk" month, it was right roughly 90% of the time.
Speed: It was also the fastest to train (under a second) and quick to run, making it practical for real-world use.

5. The "Why" (Explainable AI)

Usually, powerful computers are "black boxes": you put data in, and a result comes out, but you don't know why. The researchers used special tools (SHAP, partial dependence plots, and LIME) to open the box and peek inside. They found:

The Past is King: The single biggest predictor of next month's risk was simply what happened last month. Malaria has a strong "memory."
Nature's Role: Rain and green vegetation were strong drivers (mosquitoes love wet, green places).
The Shield Works: Higher coverage of mosquito nets reliably lowered the risk.

They also checked if the model was "overconfident" (like a weatherman who always predicts rain even when it's sunny). They found the champion model was well-calibrated, meaning its confidence levels matched reality.

6. The Catch and The Future

The authors are honest about the limitations:

The "Memory" Trick: Because the model relies heavily on what happened last month, it works incredibly well for places where malaria patterns are stable. However, if the rules of the game change suddenly (like a new disease variant or a massive climate shift), the model might need to relearn.
Data Gaps: They didn't have data on everything (like exactly how many mosquitoes were biting or specific local economic factors), so the model is missing a few puzzle pieces.
Local Flavor: This was built specifically for Kenya. It might need adjustments to work in other countries with different landscapes.

The Bottom Line

This preprint suggests that smart computer algorithms can sort malaria risk into clear, actionable categories. By using a "champion" model that learns from the past, rain, and mosquito nets, health officials could get a reliable "weather forecast" for malaria. That would help them decide when and where to send their resources, rather than guessing in the dark.

Technical Summary: Machine Learning and Explainable AI for Multi-State Classification of Malaria Transmission Dynamics in Kenya

Problem Statement
Malaria remains a critical public health challenge in sub-Saharan Africa, characterized by significant spatial and temporal heterogeneity in transmission intensity. While traditional modeling approaches (e.g., compartmental models, statistical time series) have provided insights, they often rely on restrictive assumptions such as linearity and stationarity, limiting their ability to capture complex, nonlinear interactions among climatic, environmental, and intervention-related factors. Furthermore, existing machine learning studies in malaria research frequently focus on continuous outcomes (incidence or prevalence) rather than discrete, operationally relevant risk categories used in public health decision-making. There is also a noted gap in the rigorous assessment of probabilistic calibration and the integration of explainable artificial intelligence (XAI) to ensure model transparency and practical adoption in resource-constrained settings.

Methodology
This study employs a quantitative longitudinal design using a balanced panel dataset comprising monthly observations from all 47 counties in Kenya from January 2015 to December 2025 (47 counties × 132 months = 6,204 county-month observations).

Data Sources: Malaria incidence data were sourced from the Kenya Ministry of Health's District Health Information System 2 (DHIS2) and Malaria Indicator Surveys. Environmental variables (temperature, precipitation, Normalised Difference Vegetation Index) were obtained from the Climate Hazards Group InfraRed Precipitation with Station data. Intervention data (insecticide-treated net coverage) and static geographical variables (elevation, population density) were derived from survey records and the Kenya National Bureau of Statistics.
Target Variable: The outcome is a categorical transmission state $S_{i,t} \in \{0, 1, 2, 3\}$ for county $i$ in month $t$, derived from malaria incidence per 1,000 population and categorized as Low (<5), Moderate (5–19), High (20–99), or Very High (≥100).
Feature Engineering: To capture temporal dependence, the study constructed lagged features for covariates (1 and 2 months) and lagged transmission states. The final feature vector included contemporaneous and lagged environmental, intervention, and demographic variables.
Models Evaluated: Four supervised learning algorithms were implemented: Multinomial Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM).
Validation Strategy: A forward-chaining validation scheme was used to preserve temporal structure, dividing the data into a training period (2015–2020) and a test period (2021–2025). Hyperparameters were tuned via time-ordered cross-validation within the training set.
Evaluation Metrics: Performance was assessed using Accuracy, Macro-averaged Precision, Recall, F1-score, Matthews Correlation Coefficient (MCC), Area Under the Curve (AUC), and Brier Score. Calibration was evaluated using reliability diagrams.
Explainability: The best-performing model was analyzed using SHapley Additive exPlanations (SHAP) for global feature importance, Partial Dependence Plots (PDP) for marginal effects, and Local Interpretable Model-agnostic Explanations (LIME) for local instance interpretation.
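The target definition, lagged features, and time-ordered split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: the record fields (`county`, `year`, `month`, `incidence`, `rainfall`), the function names, and the toy values are all hypothetical, and treating the category boundaries as half-open intervals (so that a rate of 19.5 counts as Moderate) is an assumption the summary does not confirm.

```python
from bisect import bisect_right
from collections import defaultdict

# Lower bounds of the Moderate, High and Very High states (cases per 1,000).
# Half-open intervals are an assumption for non-integer incidence rates.
THRESHOLDS = [5.0, 20.0, 100.0]


def transmission_state(incidence_per_1000):
    """Map incidence to 0 (Low), 1 (Moderate), 2 (High) or 3 (Very High)."""
    return bisect_right(THRESHOLDS, incidence_per_1000)


def add_lags(rows, columns=("incidence", "rainfall"), lags=(1, 2)):
    """Add per-county lagged copies of `columns` and the lagged state.

    Assumes each county has one row per consecutive month (a balanced
    panel); months without enough history are dropped.
    """
    by_county = defaultdict(list)
    for row in rows:
        by_county[row["county"]].append(row)

    out = []
    for series in by_county.values():
        series.sort(key=lambda r: (r["year"], r["month"]))
        for i in range(max(lags), len(series)):
            row = dict(series[i])
            for col in columns:
                for lag in lags:
                    row[f"{col}_lag{lag}"] = series[i - lag][col]
            row["state"] = transmission_state(row["incidence"])
            row["state_lag1"] = transmission_state(series[i - 1]["incidence"])
            out.append(row)
    return out


def temporal_split(rows, last_train_year=2020):
    """Train on years up to `last_train_year`, test on everything after."""
    train = [r for r in rows if r["year"] <= last_train_year]
    test = [r for r in rows if r["year"] > last_train_year]
    return train, test


# Made-up records for one hypothetical county, for illustration only.
toy = [
    {"county": "A", "year": 2020, "month": m, "incidence": inc, "rainfall": rain}
    for m, inc, rain in [(10, 3.0, 40.0), (11, 12.0, 90.0), (12, 25.0, 120.0)]
] + [{"county": "A", "year": 2021, "month": 1, "incidence": 110.0, "rainfall": 60.0}]

features = add_lags(toy)
train, test = temporal_split(features)
print(len(train), len(test))
```

Splitting by calendar year rather than at random keeps every test month strictly later than every training month, which is the point of a forward-chaining design: the lagged features never leak information from the future into training.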

Key Results

Model Performance: Extreme Gradient Boosting (XGBoost) achieved superior performance across all metrics, with an accuracy of 0.9918, a macro-averaged F1 score of 0.9647, an MCC of 0.9831, and the lowest Brier score (0.0031), indicating highly reliable probability estimates. Random Forest also performed strongly (Accuracy: 0.9869), while Multinomial Logistic Regression showed moderate performance. The Support Vector Machine exhibited the lowest performance (Accuracy: 0.6792) and poor calibration.
Calibration: XGBoost demonstrated strong calibration, with reliability curves closely aligned to the diagonal, whereas Logistic Regression and SVM showed systematic deviations.
Feature Importance: SHAP analysis identified lagged malaria incidence (1-month lag) as the most influential predictor, followed by environmental variables (NDVI and precipitation) and insecticide-treated net (ITN) coverage. Lagged incidence showed a strong positive association with higher transmission states, while ITN coverage showed a negative association.
Temporal Dynamics: Partial dependence analysis revealed nonlinear relationships and clear seasonal patterns, with transmission probabilities peaking during rainy seasons and varying with temperature thresholds.
Computational Efficiency: XGBoost required the shortest training time (0.6363 seconds) and maintained low inference latency, making it suitable for routine surveillance systems.
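For readers who want to see what the reported scores measure, the sketch below computes accuracy, macro-averaged F1, the multiclass Matthews Correlation Coefficient, a multiclass Brier score, and the binned table behind a reliability diagram. It rests on several assumptions: the summary does not say which Brier formulation the authors used (here it is the mean over samples of the squared error summed over the four classes), the reliability table uses the confidence of the top predicted class, and the labels and probabilities are made-up toy values rather than results from the paper.

```python
from math import sqrt


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, n_classes=4):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for k in range(n_classes):
        tp = sum(t == k and p == k for t, p in zip(y_true, y_pred))
        fp = sum(t != k and p == k for t, p in zip(y_true, y_pred))
        fn = sum(t == k and p != k for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def multiclass_mcc(y_true, y_pred, n_classes=4):
    """Multiclass MCC computed from class counts."""
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_k = [y_true.count(k) for k in range(n_classes)]
    p_k = [y_pred.count(k) for k in range(n_classes)]
    num = c * s - sum(t * p for t, p in zip(t_k, p_k))
    den = sqrt((s * s - sum(p * p for p in p_k)) * (s * s - sum(t * t for t in t_k)))
    return num / den if den else 0.0


def brier_score(y_true, probs):
    """Mean over samples of the squared error summed over classes."""
    total = 0.0
    for t, row in zip(y_true, probs):
        total += sum((p - (1.0 if k == t else 0.0)) ** 2 for k, p in enumerate(row))
    return total / len(y_true)


def reliability_bins(y_true, probs, n_bins=5):
    """Return (mean confidence, hit rate, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for t, row in zip(y_true, probs):
        conf = max(row)
        pred = row.index(conf)
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, pred == t))
    table = []
    for b in bins:
        if b:
            mean_conf = sum(c for c, _ in b) / len(b)
            hit_rate = sum(ok for _, ok in b) / len(b)
            table.append((mean_conf, hit_rate, len(b)))
    return table


# Toy labels and predicted probabilities, for illustration only.
y_true = [0, 1, 2, 3, 3]
probs = [
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.1, 0.9],
    [0.0, 0.0, 0.6, 0.4],
]
y_pred = [row.index(max(row)) for row in probs]

print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
print(multiclass_mcc(y_true, y_pred), brier_score(y_true, probs))
print(reliability_bins(y_true, probs))
```

A well-calibrated model produces bins whose hit rate sits close to the mean confidence, which is what "reliability curves closely aligned to the diagonal" describes. Accuracy alone cannot show this, which is why the Brier score and the reliability table are reported alongside it.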

Significance and Claims
The authors claim that this study provides a robust, interpretable, and scalable framework for modeling malaria transmission dynamics that directly aligns with operational decision-making frameworks. The primary contributions are:

Operational Relevance: By modeling transmission as discrete states rather than continuous values, the framework directly supports actionable risk categories used in malaria control programs.
Rigorous Evaluation: The study emphasizes the importance of probabilistic calibration alongside predictive accuracy, ensuring that risk estimates are reliable for resource allocation.
Transparency: The integration of XAI methods (SHAP, PDP, LIME) enhances model interpretability, identifying key drivers (lagged incidence, climate, interventions) and facilitating trust among public health practitioners.
Practical Deployment: The high performance and low computational cost of the XGBoost model suggest its feasibility for integration into real-time early warning systems and surveillance platforms in Kenya.
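The SHAP attributions referred to above are based on Shapley values, which split the gap between one prediction and a baseline prediction among the input features. The sketch below computes them exactly by averaging each feature's marginal contribution over every ordering of the features. It is an illustration under stated assumptions: `toy_risk_score`, its coefficients, and the feature values are hypothetical stand-ins for the fitted XGBoost model, and the brute-force enumeration is only practical for a handful of features (the SHAP library uses much faster algorithms for tree ensembles).

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for `instance` relative to `baseline`.

    For every ordering of the features, switch them one at a time from
    the baseline value to the instance value and credit each feature
    with the change in the prediction it causes.
    """
    names = list(instance)
    phi = dict.fromkeys(names, 0.0)
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            phi[name] += value - previous
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in phi.items()}


def toy_risk_score(x):
    """Hypothetical score: rises with lagged incidence and rainfall
    (including an interaction between them), falls with ITN coverage."""
    return (
        0.02 * x["incidence_lag1"]
        + 0.005 * x["rainfall"]
        - 1.0 * x["itn_coverage"]
        + 0.0001 * x["incidence_lag1"] * x["rainfall"]
    )


# Made-up feature values: a county-month to explain and a reference point.
instance = {"incidence_lag1": 60.0, "rainfall": 150.0, "itn_coverage": 0.8}
baseline = {"incidence_lag1": 20.0, "rainfall": 80.0, "itn_coverage": 0.5}

phi = shapley_values(toy_risk_score, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions always add up to the difference between the prediction for the instance and the prediction for the baseline, so each one reads as that feature's share of the change. In this toy setup the share for ITN coverage is negative, mirroring the direction of the association the summary reports.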

The paper concludes that while the high predictive performance is partly driven by the temporal persistence of malaria transmission (captured by lagged variables), the framework offers a practical tool for strengthening surveillance and evidence-based intervention strategies. The authors note that further validation in different epidemiological settings is necessary to assess generalizability.