A Bibliometric Review of Explainable AI in Diabetes Risk Prediction: Trends, Gaps, and Knowledge Graph Opportunities

This bibliometric review of 1,933 documents reveals a critical gap in combining machine learning, explainable AI, and knowledge graphs for Type 2 diabetes risk prediction, proposing a new framework to bridge statistical explanations with structured clinical pathways for improved clinical decision support.

Original authors: Van, T. A.

Published 2026-04-21

Original paper licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

The Big Picture: Predicting Diabetes with a "Black Box"

Imagine you are trying to predict who might get Type 2 diabetes (a serious condition where the body struggles to manage blood sugar). Scientists have built powerful computer programs (machine learning models) that are very good at looking at a person's data, such as weight, age, and blood pressure, and saying, "Yes, this person is at high risk."

However, there's a problem. These computer programs are like super-smart but silent magicians. They pull a rabbit out of a hat (a prediction), but they won't tell you how they did it. They are "black boxes."

Doctors need to know why the computer made that prediction. Is it because of the weight? The diet? The genetics? Without an explanation, doctors are hesitant to trust the computer. This is where Explainable AI (XAI) comes in. It's like giving the magician a microphone to say, "I pulled the rabbit out because I saw a carrot in your pocket."

What This Paper Did: A Library Detective Story

The author of this paper, Thieu Anh Van, acted like a library detective. They went through two massive digital libraries (Scopus and PubMed) and looked at 2,048 research papers published between 2015 and 2026.

Their goal was to answer three main questions:

  1. How fast is this field growing? (Spoiler: It's exploding!)
  2. Who is doing the work? (China, the US, and India are leading the pack.)
  3. What are the tools they are using? (Mostly "SHAP" and "LIME" for explanations, and "XGBoost" for predictions.)

The Big Discovery: The Missing "Map"

Here is the most important finding, explained with an analogy:

Imagine the computer program is a GPS that tells you the fastest route to a hospital.

  • The Prediction (The GPS): "Turn left in 500 feet." (Accurate, but you may not know why.)
  • The Explanation (XAI): "Turn left because the road ahead is blocked." (Better, but still just a traffic report).
  • The Missing Piece (Knowledge Graphs): A detailed map of the city's history and geography. It explains why the road is blocked (a construction project started by the city council 10 years ago) and how it connects to the rest of the city's infrastructure.

The paper found that while everyone is building great GPS systems (Predictions) and giving traffic reports (XAI), almost no one is using the detailed city map (Knowledge Graphs).

  • The Stat: Out of 2,048 papers, 906 talked about "Explainable AI" (the traffic report).
  • The Gap: Only 17 papers talked about "Knowledge Graphs" (the city map).
  • The Ratio: That's roughly a 53-to-1 gap (906 divided by 17). It's like a room of 54 people where 53 are discussing how to drive and only one is discussing how to read a map.
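
The ratio comes straight from the two counts quoted above. A quick check, using only the figures in this summary (the variable names are just for illustration):

```python
# Counts quoted in the summary above.
xai_papers = 906  # papers that mention "Explainable AI"
kg_papers = 17    # papers that mention "Knowledge Graphs"

# How many XAI papers exist for every knowledge-graph paper.
ratio = xai_papers / kg_papers
print(f"About {ratio:.1f} XAI papers per knowledge-graph paper")
```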

Why Does This Gap Matter?

The author argues that just saying "BMI is the reason" (a statistical fact) isn't enough for a doctor. A doctor needs to understand the story of the disease.

  • Current AI: "Your BMI is high, so you have a 78% risk of diabetes."
  • The Missing "Map" AI: "Your BMI is high. In medical science, we know high BMI leads to 'insulin resistance' (your cells responding poorly to insulin, the hormone that moves sugar out of the blood). This resistance, combined with your high blood pressure, creates a chain reaction that leads to diabetes. Here is the path your body is taking."
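
To make the "map" idea concrete, here is a minimal sketch of a knowledge graph stored as subject-relation-object triples, with a breadth-first search that recovers the chain from a risk factor to the disease. The triples only mirror the example sentence above; they are illustrative assumptions, not a curated clinical ontology and not taken from the paper. The helper names `build_graph` and `find_pathway` are hypothetical.

```python
from collections import deque

# Toy knowledge graph as (subject, relation, object) triples.
# Illustrative only: these echo the example sentence above and are
# NOT a vetted medical knowledge base.
TRIPLES = [
    ("high BMI", "leads to", "insulin resistance"),
    ("insulin resistance", "leads to", "type 2 diabetes"),
    ("high blood pressure", "adds to risk of", "type 2 diabetes"),
]

def build_graph(triples):
    """Index triples by subject so outgoing edges are easy to follow."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
    return graph

def find_pathway(graph, start, goal):
    """Breadth-first search; returns the list of hops or None if unreachable."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

graph = build_graph(TRIPLES)
pathway = find_pathway(graph, "high BMI", "type 2 diabetes")
for subj, rel, obj in pathway:
    print(f"{subj} --{rel}--> {obj}")
```

The point of the structure is that the answer is a path a doctor can read step by step, rather than a single number.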

The paper suggests that adding a Knowledge Graph (a structured map of medical facts) to the AI would help doctors understand the chain of events, not just the final number. This makes the AI a better partner for doctors.

The Proposed Solution: A Three-Layer Cake

To fix this, the author proposes a new way to build these AI systems, like a three-layer cake:

  1. The Bottom Layer (Prediction): The computer guesses who is at risk (using standard tools like XGBoost).
  2. The Middle Layer (Explanation): The computer explains which numbers mattered most (using tools like SHAP).
  3. The Top Layer (The Knowledge Map): The computer connects those numbers to real medical stories. It says, "High BMI isn't just a number; it triggers a biological chain reaction that we know causes diabetes."
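
The three layers can be sketched end to end in a few lines. This is a toy under stated assumptions, not the paper's implementation: a hand-written logistic score stands in for a trained model such as XGBoost, per-feature contributions relative to a baseline patient stand in for SHAP-style attributions, and a small dictionary stands in for a knowledge graph. Every weight, baseline, pathway, and function name below is invented for illustration.

```python
import math

# Layer 1 (prediction): a hand-written logistic score stands in for a trained
# model such as XGBoost. Weights and baselines are invented for illustration.
WEIGHTS = {"bmi": 0.12, "systolic_bp": 0.03, "age": 0.02}
BASELINE = {"bmi": 25.0, "systolic_bp": 120.0, "age": 45.0}
INTERCEPT = -1.0

def predict_risk(patient):
    """Return a risk estimate between 0 and 1 for one patient."""
    score = INTERCEPT + sum(
        WEIGHTS[name] * (patient[name] - BASELINE[name]) for name in WEIGHTS
    )
    return 1.0 / (1.0 + math.exp(-score))

# Layer 2 (explanation): how much each feature moved the score away from the
# baseline patient. A simple stand-in for SHAP-style attributions.
def explain(patient):
    contributions = {
        name: WEIGHTS[name] * (patient[name] - BASELINE[name]) for name in WEIGHTS
    }
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Layer 3 (knowledge map): attach a pathway to each top feature.
# Hypothetical entries echoing the summary's example, not clinical guidance.
PATHWAYS = {
    "bmi": ["high BMI", "insulin resistance", "type 2 diabetes"],
    "systolic_bp": ["high blood pressure", "type 2 diabetes"],
}

def report(patient, top_k=2):
    """Combine all three layers into a short, readable report."""
    lines = [f"Estimated risk: {predict_risk(patient):.0%}"]
    for name, value in explain(patient)[:top_k]:
        pathway = PATHWAYS.get(name)
        story = " -> ".join(pathway) if pathway else "no pathway on file"
        lines.append(f"{name}: contribution {value:+.2f}; pathway: {story}")
    return lines

patient = {"bmi": 34.0, "systolic_bp": 145.0, "age": 52.0}
for line in report(patient):
    print(line)
```

Keeping the layers as separate functions means any one of them could be swapped for a real component (a trained model, a SHAP explainer, a proper graph database) without touching the other two.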

The "So What?" for Regular People

  • The Field is Growing Fast: Interest in this topic has skyrocketed since 2020.
  • The "Black Box" is Opening: We are getting better at asking computers why they make decisions.
  • The Next Frontier: The next big step isn't just making the computer smarter; it's making the computer tell a better story using medical knowledge.
  • A Warning: Most of the data used to train these computers comes from old, small studies (like a tiny sample of 768 people from 1988). We need to test these new ideas on huge, modern populations to make sure they work for everyone, not just a few people from the past.

In a Nutshell

This paper is a call to action. It says: "We have built amazing tools to predict diabetes and explain the math behind it. But we are missing the most important part: connecting the math to the real-world story of human biology. If we add a 'Medical Knowledge Map' to our AI, we can help doctors make better, safer decisions for patients."
