Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Imagine you have a very smart, well-read librarian named LLM (Large Language Model). This librarian has read almost every book in the world and can answer almost any question. However, sometimes, when the librarian is tired or trying to be too creative, they start making things up. They might tell you that the capital of Australia is "Sydney" (it's Canberra) or that a specific rare bird eats only "moon cheese."

In the past, if you wanted to check if the librarian was telling the truth, you had to send a runner to the External Library (the internet) to find a book that confirmed or denied the statement. This is called Retrieval-Based Fact-Checking.

The Problem:
Sending a runner takes time, costs money, and sometimes the runner comes back with the wrong book or no book at all. Also, this method ignores the fact that the librarian already knows the answer inside their head; they just need to be asked the right way to admit it.

The New Idea:
This paper proposes a new game: "Fact-Checking Without Retrieval."
Instead of sending a runner to the library, we ask the librarian to look deep inside their own brain (their internal "parametric knowledge") and tell us, "Do I actually know this is true, or am I just guessing?"

The Challenge: The Librarian's "Inner Voice" is Quiet

The researchers tried many ways to listen to the librarian's inner voice:

The Confidence Check: "How sure do you sound?" (Sometimes the librarian sounds very confident even when they are lying).
The Word Count: "Did you use too many weird words?" (Not always a good sign).
The "Gut Feeling" (Internal Representations): Looking at the electrical signals in the librarian's brain while they think about the sentence.

They tested 18 different methods on 9 different types of tricky questions (from obscure facts to long stories in different languages).

The Winner: INTRA (The "Brain Scan" Method)

The researchers found that the best way to catch a lie wasn't to ask the librarian how confident they felt, but to scan their brain activity while they thought about the sentence.

They created a new tool called INTRA.

The Analogy: Imagine the librarian's brain is a giant orchestra. When they tell the truth, the violins, drums, and flutes play in perfect harmony. When they lie, the music gets a little out of tune, even if the librarian tries to hide it.
How INTRA works: It doesn't just listen to one instrument (one layer of the brain). It listens to the middle section of the orchestra (the middle layers of the AI) and combines the signals from every instrument in that section to create a single "Truth Score."

Why This Matters (The Real-World Magic)

Speed: It's like checking your own memory instead of Googling it. It takes a fraction of a second.
No Internet Needed: You can fact-check in a cave, on a spaceship, or anywhere without Wi-Fi.
Better at the "Long Tail": If you ask about a super obscure fact (like "What is the name of the 3rd president of a tiny island nation?"), the old methods (sending runners) often fail because that info isn't on the first page of Google. But INTRA, by listening to the librarian's deep memory, is surprisingly good at catching lies about rare things.
Multilingual: It works in many languages, not just English.

The Bottom Line

The paper says: "Stop relying so much on external search engines to check AI lies. The AI actually knows the truth inside its own head. If we learn how to read its internal signals correctly (using our new tool, INTRA), we can catch lies faster, cheaper, and more accurately than before."

It's like teaching the librarian to be their own honest judge, rather than always needing a referee from the outside.

Here is a detailed technical summary of the paper "Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval."

1. Problem Statement

The paper addresses the limitations of current Retrieval-Augmented Generation (RAG) based fact-checking systems. While methods like FActScore and SAFE verify claims by retrieving external evidence, they suffer from:

Retrieval Errors: Performance is heavily dependent on the quality of retrieved documents; noisy or irrelevant evidence leads to false positives/negatives.
Latency & Scalability: Querying external databases for every claim introduces significant computational overhead.
Underutilization of Internal Knowledge: These methods largely ignore the factual knowledge already encoded within the Large Language Model's (LLM) parameters.
Context Dependency: They verify "faithfulness" to retrieved context rather than intrinsic "factual correctness."

The authors propose a new setting: Fact-Checking Without Retrieval. The goal is to determine the truthfulness of an atomic claim using only the LLM's internal parametric knowledge and hidden state representations, without access to external databases or the original generation prompt.

2. Methodology

A. Proposed Framework: INTRA

The authors introduce INTRA (Intrinsic Truthfulness Assessment), a method designed to exploit interactions between internal model representations. Unlike previous approaches that focus on specific layers or tokens, INTRA aggregates information across the model's depth.

Key Technical Components:

Token and Layer Selection: Instead of relying solely on the first/last token or a single layer, INTRA computes a sequence-level embedding for every layer $l$ by aggregating the token-level hidden states $h_l(y_i)$ of the $N$ tokens in claim $y$ with a learnable attention mechanism:
$h_l(y) = \sum_{i=1}^{N} \alpha_{l,i} \, h_l(y_i)$
where $\alpha_{l,i}$ are attention weights derived from a learnable parameter vector $\theta$.
Layer-wise Truthfulness Score: A linear classifier is applied to the sequence embedding of each layer to produce a probability score $p_l(\text{Verified} \mid y)$.
Aggregated Score: To combine information across layers, the authors train an L2 regression model on top of the layer-wise probabilities. Crucially, they apply quantile normalization $q(\cdot)$ to the probabilities before regression to standardize them across layers:
$\text{INTRA}(y) = \sum_{l \in L} \beta_l \cdot q(p_l(\text{Verified} \mid y)) + b$
where $L$ is the set of selected layers, $\beta_l$ are the regression weights, and $b$ is the bias term.
Note: The method focuses on middle layers (e.g., layers 11–22 for Llama 3.1-8B), as ablation studies showed first and last layers are less effective.
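The three components above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the attention weights are a softmax over dot products between $\theta$ and each token state, and that quantile normalization maps a probability to its empirical quantile among that layer's training-set probabilities. The function names and toy values are hypothetical.

```python
import math
from bisect import bisect_right


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_states, theta):
    """h_l(y) = sum_i alpha_{l,i} * h_l(y_i) for one layer.

    Assumption: alpha = softmax(theta . h_l(y_i)); the summary only says the
    weights are derived from a learnable vector theta.
    """
    alphas = softmax([dot(theta, h) for h in token_states])
    dim = len(token_states[0])
    return [sum(a * h[d] for a, h in zip(alphas, token_states)) for d in range(dim)]


def layer_probability(pooled, w, bias):
    """Linear classifier followed by a sigmoid: p_l(Verified | y)."""
    return 1.0 / (1.0 + math.exp(-(dot(w, pooled) + bias)))


def quantile_normalize(p, reference_sorted):
    """Assumed form of q(.): empirical quantile of p among sorted reference values."""
    return bisect_right(reference_sorted, p) / len(reference_sorted)


def intra_score(layer_probs, references, betas, b):
    """INTRA(y) = sum_l beta_l * q(p_l(Verified | y)) + b."""
    return sum(
        beta * quantile_normalize(p, ref)
        for p, ref, beta in zip(layer_probs, references, betas)
    ) + b


# Toy usage: two tokens, two-dimensional hidden states, two layers.
pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], theta=[0.0, 0.0])
p_layer = layer_probability(pooled, w=[0.0, 0.0], bias=0.0)
score = intra_score(
    layer_probs=[p_layer, 0.95],
    references=[[0.1, 0.4, 0.6, 0.9], [0.2, 0.5, 0.8, 1.0]],
    betas=[0.5, 0.5],
    b=0.0,
)
print(pooled, p_layer, score)
```

In a real setting, `theta`, the per-layer classifier weights, and the regression coefficients would all be learned from labeled claims, and the token states would come from the model's middle layers.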

B. Evaluation Benchmark

To rigorously test generalization, the authors constructed a comprehensive evaluation framework spanning 9 datasets and 3 models (Llama 3.1-8B, Ministral-8B, Phi-4-mini). The datasets test five dimensions:

Long-tail Knowledge: PopQA and Wild Hallucinations (AC-WH).
Claim Source Variation: Human-authored (AVeriTeC, X-Fact) vs. Model-generated (UHead, Common Claims).
Multilinguality: X-Fact (25 languages).
Long-form Generation: Claims extracted from extended text (AC-WH, UHead).
Cross-Model Claims: Claims generated by different LLMs (GPT-3, Mistral, Llama).

C. Baselines

The study evaluates 18 methods, categorized into:

Unsupervised: Uncertainty quantification (Perplexity, Entropy, Sequence Probability), Attention-based scores, and Verbalized scores (LLM self-evaluation).
Supervised: Linear probes on hidden states (SAPLMA, MM, MIND), Contrastive methods (CCS), and specialized heads (UHead, TAD).
Retrieval-Based: A baseline using Google Search snippets (Verb + RAG).
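For intuition about the unsupervised uncertainty baselines, the sketch below computes sequence log-probability, perplexity, and mean token entropy from token-level outputs. These are the standard textbook definitions, and the paper's exact variants may differ. The inputs (per-token log-probabilities and per-token probability distributions) are assumed to have been extracted from the model already, and the function names are illustrative.

```python
import math


def sequence_log_prob(token_logprobs):
    """Log-probability of the whole claim: the sum of token log-probabilities."""
    return sum(token_logprobs)


def perplexity(token_logprobs):
    """exp of the negative mean token log-probability; higher means less certain."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def mean_token_entropy(token_distributions):
    """Average Shannon entropy (in nats) of the per-token output distributions."""
    def entropy(dist):
        return -sum(p * math.log(p) for p in dist if p > 0.0)

    return sum(entropy(d) for d in token_distributions) / len(token_distributions)


# Toy usage: three tokens, each generated with probability 0.5.
logprobs = [math.log(0.5)] * 3
print(sequence_log_prob(logprobs), perplexity(logprobs))
print(mean_token_entropy([[0.5, 0.5], [1.0, 0.0]]))
```

All three scores use only the model's output distribution, which is why they can be confidently wrong: a fluent but false claim can still receive high token probabilities.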

3. Key Results

Superiority of Internal Representations: Across all models and datasets, methods leveraging internal representations (supervised and INTRA) consistently outperformed logit-based uncertainty signals (like Perplexity or Sequence Probability).
INTRA Performance:
- Achieved State-of-the-Art (SoTA) average performance across all models.
- On Llama 3.1, INTRA achieved a 77.7 ROC-AUC (vs. 75.0 for the runner-up, Sheeps) and 73.1 PR-AUC.
- It demonstrated strong generalization, performing consistently well across long-tail, multilingual, and cross-model datasets where other methods failed.
Comparison with Retrieval:
- INTRA matches the ROC-AUC of the retrieval-based baseline (Verb + RAG) but surpasses it by ~3% in PR-AUC (area under the precision-recall curve).
- Efficiency: INTRA requires approximately 20x less computational time than retrieval-based methods (0.06s vs. 0.95s per instance) and does not require external API calls.
Layer Analysis: Experiments confirmed that intermediate layers contain the most informative signals for truthfulness. Aggregating multiple middle layers yields significantly better results than using a single layer.
Long-tail Robustness: INTRA showed a 30% improvement over the second-best method on rare entities (long-tail knowledge), whereas uncertainty-based methods (like Perplexity) failed on these.
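The results above are reported as ROC-AUC and PR-AUC. As a reminder of what these metrics measure, here is a small sketch that computes both from binary labels and detector scores. It uses the pairwise-ranking definition of ROC-AUC and average precision as a common estimate of PR-AUC. Whether the paper uses this exact PR-AUC estimator is an assumption, and the labels and scores are made-up toy values.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5).

    Assumes both classes are present in `labels`.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example.

    Assumes at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy usage: four claims, label 1 marks the positive class.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores), average_precision(labels, scores))
```

ROC-AUC is insensitive to class balance, while PR-AUC focuses on the positive class, so two methods can tie on the first and still differ on the second, as in the comparison with the retrieval baseline.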

4. Key Contributions

New Task Setting: Formalized "Fact-Checking Without Retrieval," shifting focus from faithfulness to retrieved context to intrinsic factual correctness.
Comprehensive Benchmark: Introduced a large-scale evaluation framework covering 9 datasets, 3 models, and 5 generalization dimensions (long-tail, source, language, length, cross-model).
INTRA Method: Proposed a simple yet effective architecture that aggregates layer-wise internal representations, achieving SoTA performance and robust generalization.
Empirical Insights: Demonstrated that middle layers are critical for truthfulness detection and that retrieval-free methods can outperform retrieval-based ones in precision while being significantly faster.

5. Significance and Future Directions

Scalability: By removing the dependency on external retrieval, this approach enables real-time, low-latency fact-checking suitable for high-throughput applications.
Training Signals: The ability to detect hallucinations without external tools makes these detectors viable as reward models for Reinforcement Learning from Human Feedback (RLHF) or as direct components in the generation process (e.g., self-correction).
Intrinsic Capabilities: The work provides evidence that LLMs possess rich, accessible factual signals within their parameters that can be harnessed without external grounding, challenging the notion that retrieval is always necessary for verification.

In conclusion, the paper establishes that leveraging internal model representations via methods like INTRA offers a robust, scalable, and highly accurate alternative to traditional retrieval-based fact-checking, particularly in scenarios requiring generalization across diverse domains and languages.
