scGRIP: a graph-based explainable AI framework for single-cell multi-omics Gene Regulatory Inference with Prior Knowledge

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

The Big Picture: Decoding the Cell's "Instruction Manual"

Imagine a single cell in your body as a tiny, bustling city. Inside this city, there are millions of workers (proteins) and a massive library of blueprints (DNA). But here's the catch: the library is huge, and the city doesn't need to read every blueprint at once. It only needs to read the specific ones required for its current job.

Gene Regulatory Networks (GRNs) are the "foreman's instructions" that tell the city which blueprints to open and which to ignore.

Transcription Factors (TFs): The foremen or managers.
Regulatory Elements (REs): The light switches on the wall.
Target Genes (TGs): The actual blueprints or machines being turned on.

For a long time, scientists could only look at the "average" city. They took a bucket of millions of cells, mashed them together, and tried to guess the rules. But this is like trying to understand a specific traffic jam by looking at the average traffic of an entire country. You miss the unique details of individual cells.

New technology (single-cell sequencing) lets us look at one cell at a time. But the data is messy, noisy, and overwhelming. Existing tools for decoding these instructions are too slow, too hard to interpret, or blind to the "big picture" context.

Enter scGRIP.

What is scGRIP? (The "Smart City Planner")

scGRIP is a new computer program (an AI framework) designed to figure out exactly how individual cells decide which genes to turn on. It does this by combining three clever tricks:

1. The "City Map" (The Prior Knowledge Graph)

Imagine trying to navigate a new city without a map. You'd get lost. Existing AI tools often try to learn the city layout from scratch, which is hard and error-prone.

scGRIP starts with a pre-drawn map. It knows that certain foremen (TFs) usually stand near certain switches (REs) and that those switches usually control specific machines (TGs). It builds a Graph (a network of dots and lines) based on what scientists already know about biology.

Analogy: Instead of guessing where the grocery store is, scGRIP starts with a GPS map that says, "The store is usually 2 blocks north of the park." This gives the AI a head start.

2. The "Universal Translator" (Tokenization)

Cells speak two languages at once: the language of DNA accessibility (which switches are open) and the language of Gene Expression (which machines are running). These languages are different, and translating them is hard.

scGRIP uses a Shared Codebook (like a universal dictionary). It translates both the "switch" language and the "machine" language into the same set of codes.

Analogy: Imagine you have one team of people speaking French and another speaking Japanese. Instead of hiring two separate translators, scGRIP gives everyone a single "Emoji Dictionary." Now a "smiley face" means "happy" to both groups, so they can understand each other and work together.

3. The "Why?" Detective (Explainable AI)

Most AI models are "black boxes." They give you an answer (e.g., "This cell is sick"), but they don't tell you why. If a doctor can't see the reasoning behind a diagnosis, it is hard to trust it.

scGRIP uses a technique called GraphSHAP. It acts like a detective who interrogates the network. It asks: "If we remove this specific foreman, does the instruction change?" or "If we flip this switch, does the machine stop?"

Analogy: Imagine a Rube Goldberg machine. If you want to know which domino caused the final cup to tip over, you could remove dominoes one by one. scGRIP does this mathematically to pinpoint exactly which "switch" and "foreman" are responsible for a cell's behavior.

What Did They Discover? (The Alzheimer's Case Study)

To prove it works, the scientists used scGRIP to study Alzheimer's Disease (AD). They looked at brain cells from patients with AD and compared them to healthy brains.

The Old Way: Might say, "These brain cells are generally inflamed."
The scGRIP Way: It zoomed in on a specific type of immune cell in the brain called microglia. It found that in Alzheimer's patients, a specific manager named SPI1 was frantically flipping switches to turn on "Amyloid" (plaque) cleanup crews and "Immune" alarm systems.

It didn't just say "it's bad." It showed the exact chain of command:

Foreman SPI1 → Flips Switch A → Turns on Gene B (Amyloid response).

This level of detail helps scientists understand how the disease progresses, not just that it is progressing.

Why is This a Big Deal?

It's Fast and Scalable: It can handle massive datasets (thousands of cells) without crashing the computer, unlike older methods that get stuck in traffic.
It's Trustworthy: Because it uses the "Detective" method (GraphSHAP), scientists can see the logic behind the AI's conclusions.
It's Precise: It captures the unique personality of each cell, rather than just the average. This is crucial for finding rare cell types that might be the key to curing diseases.

The Bottom Line

scGRIP is like giving scientists a high-definition, annotated map of the cellular city, complete with a translator and a detective. It allows us to stop guessing how cells work and start reading their specific instruction manuals, one cell at a time. This brings us closer to understanding complex diseases like Alzheimer's and potentially finding better ways to treat them.

1. Problem Statement

Single-cell multi-omics technologies (specifically paired scRNA-seq and scATAC-seq) enable the reconstruction of Gene Regulatory Networks (GRNs) at cellular resolution, offering insights into cellular heterogeneity and disease mechanisms. However, existing methods face three critical limitations:

Lack of Interpretability: Many deep learning models act as "black boxes," making it difficult to attribute predictions to specific regulatory interactions (TF-RE or RE-TG).
Scalability Issues: Current frameworks often struggle with the high dimensionality and sparsity of single-cell data, or require computationally expensive training strategies (e.g., training one model per gene).
Inconsistent Integration: Methods often rely on heterogeneous external databases and post-hoc attribution tools rather than integrating prior biological knowledge directly into the model architecture.

2. Methodology: The scGRIP Framework

scGRIP is a graph variational autoencoder (VAE) that integrates prior regulatory knowledge, foundation-model-inspired tokenization, and graph-based explainable AI (XAI). The architecture consists of four main components:

A. Prior-Guided Heterogeneous Graph Construction

scGRIP formalizes the regulatory landscape as a graph $G=(V, E)$ where:

Nodes ( $V$ ): Transcription Factors (TFs), Regulatory Elements (REs/peaks), and Target Genes (TGs).
Edges ( $E$ ):
- TF-RE: Derived from cisTarget databases (motif binding predictions).
- RE-TG: Derived from genomic proximity (distance to Transcription Start Sites, TSS).
This graph serves as a static topological scaffold incorporating prior biological knowledge.
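
The graph construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `build_prior_graph`, the input dictionaries, the toy identifiers, and the 250 kb peak-to-TSS window are all assumptions made for the example. In practice the motif hits would come from a cisTarget-style database.

```python
def build_prior_graph(motif_hits, peaks, tss, max_distance=250_000):
    """Build TF-RE and RE-TG edge sets from prior knowledge.

    motif_hits:   TF name -> set of peak ids with a predicted binding motif
    peaks:        peak id -> (chrom, start, end)
    tss:          gene name -> (chrom, TSS position)
    max_distance: assumed peak-to-TSS window in base pairs (not from the paper)
    """
    nodes = {"TF": set(), "RE": set(peaks), "TG": set(tss)}
    edges = {"TF-RE": set(), "RE-TG": set()}

    # TF-RE edges: a TF is linked to every observed peak carrying its motif.
    for tf, hit_peaks in motif_hits.items():
        for peak in hit_peaks:
            if peak in peaks:  # ignore motif hits outside the measured peaks
                nodes["TF"].add(tf)
                edges["TF-RE"].add((tf, peak))

    # RE-TG edges: a peak is linked to genes whose TSS lies within the window.
    for peak, (chrom, start, end) in peaks.items():
        midpoint = (start + end) // 2
        for gene, (gene_chrom, tss_pos) in tss.items():
            if chrom == gene_chrom and abs(midpoint - tss_pos) <= max_distance:
                edges["RE-TG"].add((peak, gene))

    return nodes, edges


# Toy inputs (hypothetical identifiers and coordinates).
motif_hits = {"TF_1": {"peak_1", "peak_2", "peak_missing"}}
peaks = {
    "peak_1": ("chr1", 1_000, 1_500),
    "peak_2": ("chr1", 900_000, 900_500),
    "peak_3": ("chr2", 4_000, 4_600),
}
tss = {"GENE_A": ("chr1", 2_000), "GENE_B": ("chr2", 5_000)}

nodes, edges = build_prior_graph(motif_hits, peaks, tss)
print(sorted(edges["TF-RE"]))
print(sorted(edges["RE-TG"]))
```

Because the edges depend only on motif predictions and genomic coordinates, the same scaffold is shared by every cell; cell-specific information enters later through the node features.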

B. Cell-Specific Dynamic Embeddings via Tokenization

To transform the static graph into a dynamic model capable of capturing individual cell states, scGRIP employs a tokenization strategy inspired by foundation models (e.g., xTrimoGene):

Static Embeddings: Uses node2vec to generate structural embeddings reflecting global network connectivity.
Dynamic Embeddings: Uses a shared codebook to tokenize single-cell chromatin accessibility and gene expression values.
Integration: Combines static structural features with cell-specific value embeddings. This allows the model to learn unique molecular states for each cell within the graph structure.
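
One way to picture the tokenization step is sketched below. Every detail here is an assumption for illustration: the log-scaled binning in `value_to_token`, the eight bins, the random vectors standing in for node2vec embeddings, and the choice to combine static and value embeddings by addition. The source only states that a shared codebook tokenizes both modalities and that the two kinds of embedding are combined.

```python
import math
import random

N_BINS = 8   # assumed codebook size
DIM = 4      # assumed embedding dimension


def value_to_token(value, n_bins=N_BINS, max_log=math.log1p(100.0)):
    """Map a non-negative measurement to a token id; token 0 means 'not detected'."""
    if value <= 0:
        return 0
    scaled = min(math.log1p(value) / max_log, 1.0)
    return 1 + min(int(scaled * (n_bins - 1)), n_bins - 2)


rng = random.Random(0)

# One codebook shared by expression values and accessibility values.
codebook = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_BINS)]

# Stand-in for node2vec structural embeddings (random here, hypothetical names).
static_emb = {
    name: [rng.gauss(0, 1) for _ in range(DIM)]
    for name in ("TF_1", "peak_1", "GENE_A")
}


def cell_node_embedding(node, value):
    """Static structural embedding plus the embedding of the value token."""
    token = value_to_token(value)
    return [s + c for s, c in zip(static_emb[node], codebook[token])]


# One cell: expression for the TF and gene, accessibility for the peak.
cell = {"TF_1": 12.0, "peak_1": 1.0, "GENE_A": 0.0}
embeddings = {node: cell_node_embedding(node, v) for node, v in cell.items()}
print({node: value_to_token(v) for node, v in cell.items()})
```

The static part is identical across cells, while the token lookup changes with each cell's measurements, which is what makes the node features cell-specific.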

C. Explainable Inference via GraphSHAP

Instead of post-hoc analysis, scGRIP integrates GraphSHAP (specifically adapting GraphSVX) directly into the inference pipeline:

Mechanism: It treats neighboring nodes (TFs and REs) as contributors to a target node (TG). It samples coalitions of regulators, masks them, and fits a weighted linear surrogate model to estimate Shapley values.
Output: This yields edge attribution scores at the single-cell level, quantifying the specific contribution of each TF-RE and RE-TG interaction to the predicted gene expression.
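
The quantity being estimated can be made concrete with a toy example. The sketch below computes exact Shapley values by enumerating every coalition of three regulators, which is feasible only for tiny neighborhoods; the sampling plus weighted linear surrogate described above exists because this enumeration grows exponentially. The toy `predicted_expression` function and the regulator names are hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value_fn):
    """Exact Shapley values: weighted average of each player's marginal contribution."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {p}) - value_fn(s))
        phi[p] = total
    return phi


# Toy activity levels for one cell (hypothetical).
activity = {"TF_1": 2.0, "peak_1": 1.0, "peak_2": 0.5}


def predicted_expression(unmasked):
    """Toy model of a target gene given the regulators that are NOT masked.

    peak_2 acts on its own; TF_1 only has an effect through peak_1.
    """
    out = 0.5 * activity["peak_2"] if "peak_2" in unmasked else 0.0
    if "TF_1" in unmasked and "peak_1" in unmasked:
        out += activity["TF_1"] * activity["peak_1"]
    return out


regulators = ["TF_1", "peak_1", "peak_2"]
phi = shapley_values(regulators, predicted_expression)
print(phi)
```

In this toy case the TF and the peak it acts through split their joint effect equally, and the attributions sum to the difference between the full prediction and the fully masked prediction, which is the efficiency property that makes Shapley values attractive for edge attribution.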

D. Embedded Topic Model (ETM) Decoder

The decoder uses a linear structure to reconstruct gene expression and chromatin accessibility from latent topic distributions. This ensures the learned representations are interpretable as coherent biological programs (topics) linking TFs, REs, and TGs.
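
A rough sketch of a linear topic-model decoder follows. It assumes the standard embedded topic model form, in which a cell's topic proportions are mixed with per-topic feature distributions obtained from the inner product of topic and feature embeddings; whether scGRIP normalizes in exactly this way is not stated in this summary, and all names and numbers below are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def etm_decode(latent, topic_emb, feature_emb):
    """Reconstruct feature proportions for one cell from its latent vector.

    latent:      unnormalized topic scores for the cell (length K)
    topic_emb:   K topic embedding vectors
    feature_emb: F feature (gene or peak) embedding vectors
    """
    theta = softmax(latent)  # topic proportions of the cell
    # beta[k][j]: weight of feature j under topic k (softmax over features).
    beta = [
        softmax([sum(a * b for a, b in zip(t, f)) for f in feature_emb])
        for t in topic_emb
    ]
    # Linear mixing step: each feature is a theta-weighted sum over topics.
    recon = [
        sum(theta[k] * beta[k][j] for k in range(len(theta)))
        for j in range(len(feature_emb))
    ]
    return theta, beta, recon


# Toy example: 2 topics, 3 features, 2-dimensional embeddings.
topic_emb = [[1.0, 0.0], [0.0, 1.0]]
feature_emb = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
theta, beta, recon = etm_decode([0.3, -0.3], topic_emb, feature_emb)
print(theta, recon)
```

Because the reconstruction is linear in the topic proportions, each row of `beta` can be read directly as a ranked list of the genes or peaks that define a topic.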

3. Key Contributions

Graph-Based Representation Learning: A unified framework that treats TFs, REs, and TGs as nodes in a heterogeneous graph, leveraging GraphSAGE for efficient neighborhood aggregation and scalability.
Explainable AI Integration: The novel application of GraphSHAP within a VAE to infer cell-specific regulatory edge weights, moving beyond static network inference to dynamic, context-aware attribution.
Tokenization for Multi-omics: Adapting foundation-model tokenization techniques to single-cell data, allowing the model to capture both immediate data-driven insights and broader network context simultaneously.
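
To illustrate why GraphSAGE-style aggregation scales, the sketch below implements a single mean-aggregation layer with neighbor sampling in plain Python. It is a generic illustration of the technique under simplifying assumptions (identity weight matrices, a tiny hand-made graph, hypothetical node names), not the layer configuration used in scGRIP.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sage_layer(features, neighbors, W_self, W_neigh, num_samples=2, rng=None):
    """One GraphSAGE mean layer: ReLU(W_self h_v + W_neigh mean(h_u))."""
    rng = rng or random.Random(0)
    out = {}
    for node, h in features.items():
        nbrs = neighbors.get(node, [])
        # Sampling a fixed number of neighbors bounds the cost per node.
        if len(nbrs) > num_samples:
            nbrs = rng.sample(nbrs, num_samples)
        if nbrs:
            mean = [
                sum(features[u][i] for u in nbrs) / len(nbrs)
                for i in range(len(h))
            ]
        else:
            mean = [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(W_self, h), matvec(W_neigh, mean))]
        out[node] = [max(0.0, v) for v in z]
    return out


# Toy graph: messages flow TF -> RE -> TG.
features = {"TF_1": [1.0, 0.0], "peak_1": [0.0, 2.0], "GENE_A": [0.5, -3.0]}
neighbors = {"peak_1": ["TF_1"], "GENE_A": ["peak_1"]}
identity = [[1.0, 0.0], [0.0, 1.0]]

hidden = sage_layer(features, neighbors, identity, identity)
print(hidden)
```

Each node only touches a bounded sample of its neighbors, so the cost per layer grows with the number of nodes rather than with the size of the largest neighborhood.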

4. Results

The authors evaluated scGRIP on three multimodal datasets (PBMC, BMMC, and Human Cortex) and an Alzheimer's Disease (AD) dataset.

TF-RE Inference Accuracy:
- Compared against state-of-the-art methods (LINGER, GLUE) and Pearson correlation baselines.
- Performance: scGRIP achieved the highest average Area Under the Precision-Recall Curve (AUPRC) of 0.613 across 14 TFs and 3 cell types, outperforming GLUE (0.606) and LINGER (0.599).
- Validation: Predictions showed higher concordance with ChIP-seq peaks (ChIP-Atlas) and experimental accessibility patterns.
Cell-Type and Condition Classification:
- Using inferred GRN scores as features for a KNN classifier, scGRIP achieved superior AUROC in distinguishing cell types (e.g., Astrocytes, Microglia) compared to ablation variants (without SHAP) and LINGER.
- This demonstrates that SHAP-derived weights capture subtle, context-dependent regulatory dynamics essential for cell identity.
Alzheimer's Disease (AD) Case Study:
- Differential GRNs: Identified condition-specific regulatory shifts in microglia, including upregulation of immune signaling (neutrophil degranulation, interferon-gamma) and amyloid-related pathways.
- Key Findings: Successfully recovered known AD-associated interactions, such as TREM2–APOE and HCLS1–CD86.
- Validation: scGRIP-inferred RE-TG pairs showed higher hypergeometric enrichment with an independent AD validation dataset (ROSMAP) compared to LINGER, particularly in microglia and astrocytes.
Scalability and Imputation:
- Memory Efficiency: scGRIP scales efficiently with graph size, outperforming attention-based models (GAT) and matching GLUE's efficiency.
- Cross-Modality Imputation: Achieved high Pearson correlations (0.73 for ATAC2RNA, 0.57 for RNA2ATAC), outperforming BABEL and other baselines.
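
Two of the evaluation metrics above are easy to state in code. The sketch below computes AUPRC as average precision (one common estimator; the preprint may use a different interpolation) and an upper-tail hypergeometric p-value for overlap enrichment. The labels, scores, and counts are made-up toy values, not results from the paper.

```python
from math import comb


def average_precision(labels, scores):
    """Average precision: mean of the precision at each true positive's rank."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


# Toy TF-RE predictions: 1 marks a pair supported by ChIP-seq.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)

# Toy enrichment: 3 predicted pairs, all 3 found among 3 validated pairs out of 10.
p_value = hypergeom_upper_tail(3, population=10, successes=3, draws=3)
print(ap, p_value)
```

Average precision rewards ranking true interactions near the top of the list, which is why it is preferred over AUROC when true regulatory pairs are rare.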

5. Significance

Biological Insight: scGRIP provides a mechanism to uncover cell-type-specific and disease-specific regulatory programs that are often masked by bulk or average-based analyses. It successfully identified coordinated activation of immune and amyloid pathways in AD microglia.
Interpretability: By integrating Shapley values directly into the model, it offers a principled, axiomatic explanation for regulatory predictions, bridging the gap between deep learning performance and biological interpretability.
Scalability: The framework is designed to handle large-scale single-cell multi-omics datasets efficiently, making it a practical tool for future large-scale atlases.
Future Directions: The authors propose extending the framework to incorporate 3D chromatin conformation data (Hi-C) and spatial transcriptomics to further refine regulatory element-gene linkages.

In summary, scGRIP represents a significant advancement in computational genomics by combining graph neural networks, foundation-model tokenization, and explainable AI to deliver high-resolution, interpretable, and scalable gene regulatory network inference.