A Consensus-Driven Multi-LLM Pipeline for Missing-Person Investigations

This paper introduces Guardian, a consensus-driven multi-LLM pipeline, enhanced by QLoRA fine-tuning, that coordinates specialized models and a consensus engine to perform auditable, structured information extraction for time-critical missing-person investigations while avoiding unconstrained decision-making.

Joshua Castillo, Ravi Mukkamala

Published Wed, 11 Ma

Imagine you are trying to solve a mystery, like finding a lost child in a vast forest. The first 72 hours are the most critical, but the information you have is messy: a blurry photo, a vague witness statement, a weather report, and a map. You need to turn this chaos into a clear plan to save the day.

This paper introduces Guardian, a smart computer system designed to help police and rescue teams do exactly that. Instead of relying on just one super-smart computer brain, Guardian uses a "team of brains" approach to make sure the answers are right.

Here is how it works, broken down into simple concepts:

1. The Problem: One Brain Can Be Wrong

Usually, when we use AI, we ask one big model to read a messy police report and tell us where to look. But AI can sometimes "hallucinate" (make things up) or get confused by messy handwriting or bad grammar. In a missing-person case, if the AI guesses the wrong location, it wastes precious time and resources.

2. The Solution: The "Panel of Experts"

Guardian doesn't trust just one AI. Instead, it acts like a jury.

  • The Witnesses (Multiple Models): The system asks several different AI models (like Qwen and Llama) to read the same report and give their best guess.
  • The Judge (The Consensus Engine): A special "referee" AI looks at all the different answers. If two experts say the child was seen near a park, but one says "the beach," the referee checks the evidence. It doesn't just pick a random answer; it looks for agreement.
  • The Safety Net: If an AI makes a mistake (like writing a messy list instead of a clean one), the system has a "repair crew" that fixes the formatting before anyone sees it.
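The "jury" idea above can be sketched as a simple majority vote. This is an illustrative toy, not Guardian's actual consensus engine: the model names and the `min_agreement` threshold are made up for the example, and the key behavior is that the referee abstains rather than guessing when the experts don't agree.

```python
from collections import Counter

def consensus_vote(answers, min_agreement=2):
    """Pick the answer most models agree on; abstain if agreement is weak.

    `answers` maps a model name to its extracted value for one field.
    """
    counts = Counter(answers.values())
    best, votes = counts.most_common(1)[0]
    if votes >= min_agreement:
        return {"value": best, "votes": votes, "status": "agreed"}
    # No sufficient agreement: flag for human review instead of guessing.
    return {"value": None, "votes": votes, "status": "needs_review"}

# Two models say "park", one says "beach" -> the jury sides with "park".
result = consensus_vote({"qwen": "park", "llama": "park", "mistral": "beach"})
print(result)  # {'value': 'park', 'votes': 2, 'status': 'agreed'}
```

Note the failure mode: with three different answers, no value reaches the threshold and the function returns `needs_review` instead of a confident wrong guess.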

3. The Process: A Factory Assembly Line

Think of the system as a highly organized factory:

  • Stage 1: The Intake (The Parser): Raw, messy documents (PDFs, notes, texts) are dumped onto a conveyor belt. The system cleans them up, checks them for errors, and organizes them.
  • Stage 2: The Assembly (The Core): This is where the "jury" works.
    • Summarizers write short, clear bullet points of what happened.
    • Extractors pull out specific facts (like "last seen at 5 PM" or "wearing a red hat").
    • Labelers guess the risks (e.g., "Is the child likely to run far?").
  • Stage 3: The Consensus Check: Before the final report goes to the human detective, the "Judge" compares all the answers. If the models disagree, the Judge steps in to resolve the conflict, ensuring the final answer is based on facts, not guesses.
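The assembly line can be sketched end to end. Everything here is a stand-in: `parse` just normalizes whitespace, `extract_facts` uses keyword matching where the real system would call an LLM, and `repair_json` is a deliberately crude version of the "repair crew" that catches malformed output before it reaches a human.

```python
import json

def parse(raw_text):
    """Stage 1 (Intake): clean and normalize a raw, messy report."""
    return " ".join(raw_text.split())  # collapse stray whitespace and newlines

def extract_facts(report):
    """Stage 2 (Assembly): pull structured fields; an LLM would do this in practice."""
    facts = {}
    if "red hat" in report:
        facts["clothing"] = "red hat"
    if "5 PM" in report:
        facts["last_seen_time"] = "5 PM"
    return facts

def repair_json(maybe_json):
    """Safety net: fall back to an empty record instead of crashing on bad output."""
    try:
        return json.loads(maybe_json)
    except json.JSONDecodeError:
        return {}

report = parse("Last seen   at 5 PM,\n wearing a red hat near the park.")
print(extract_facts(report))  # {'clothing': 'red hat', 'last_seen_time': '5 PM'}
```

The point of the structure is that each stage has one job and a predictable output shape, so a failure in one stage (bad formatting, a missing field) is caught locally instead of propagating to the final search plan.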

4. The "Training" (QLoRA)

To make the experts better, the system gives them a special, efficient training course called QLoRA.

  • Analogy: Imagine you have a very smart student (the AI). Instead of rewriting their entire textbook (which takes forever and costs a lot of money), you just give them a few sticky notes with specific rules for this job.
  • This makes the AI much better at reading police reports without needing a super-expensive computer to run it.
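The "sticky notes" in the analogy are low-rank adapter matrices: instead of retraining a full weight matrix, QLoRA-style training learns two small matrices whose product adjusts it. A bit of arithmetic (with an illustrative 4096x4096 projection, typical of 7B-class models, and a made-up rank of 8) shows why this is so cheap:

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in one LoRA adapter pair (A: d_in x rank, B: rank x d_out)."""
    return d_in * rank + rank * d_out

full = 4096 * 4096                                        # ~16.8M weights per matrix
sticky_notes = lora_trainable_params(4096, 4096, rank=8)  # 65,536 weights

print(f"LoRA trains {sticky_notes / full:.2%} of the weights")  # LoRA trains 0.39% of the weights
```

The frozen base model stays untouched (and, in QLoRA, quantized to 4-bit), so only a fraction of a percent of the parameters ever needs gradients, which is what lets the training run on modest hardware.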

5. The "Zone QA" (The Map Check)

Once the system suggests a search area (a "zone"), a special safety module checks it.

  • Analogy: It's like a quality control inspector on a map. If the AI suggests searching a huge, impossible-to-reach mountain range, the inspector says, "Wait, that's too big and risky. Let's narrow it down." It ensures the search plan is realistic and safe.
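The map inspector can be sketched as a rule-based gate over a proposed zone. The field names and thresholds below are invented for illustration, not values from the paper; the point is that the check returns explicit reasons, keeping the audit trail the paper emphasizes.

```python
def check_zone(zone, max_area_km2=25.0, max_terrain_risk=0.7):
    """Hypothetical quality gate: reject search zones that are too large or risky.

    `zone` is a dict with 'area_km2' and 'terrain_risk' (0..1).
    """
    problems = []
    if zone["area_km2"] > max_area_km2:
        problems.append("zone too large -- narrow it down")
    if zone["terrain_risk"] > max_terrain_risk:
        problems.append("terrain too risky for ground teams")
    return {"approved": not problems, "problems": problems}

# A huge, hazardous mountain range gets sent back for revision.
print(check_zone({"area_km2": 120.0, "terrain_risk": 0.9}))
```

Because the gate explains *why* a zone was rejected, a human coordinator can override it with full context rather than fighting an opaque veto.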

Why This Matters

The most important part of this paper is the philosophy: Reliability is a team sport.

In the past, we hoped one AI would be perfect. Guardian admits that AI makes mistakes. So, instead of hoping, it builds a system where mistakes are caught by the team before they reach the human investigators.

  • No "Black Boxes": Every step is recorded. If the system makes a decision, you can look back and see exactly which AI said what and why the "Judge" picked that answer.
  • Conservative & Safe: If the AI isn't sure, it admits it. It won't guess wildly. It prefers to say "we don't know for sure" rather than give a confident but wrong answer.

The Bottom Line

Guardian is a safety-first AI pipeline for finding missing people. It treats AI models like fallible experts, uses a "jury" to agree on the facts, fixes errors automatically, and ensures that the final search plan is built on solid, auditable evidence rather than a lucky guess. It turns messy, scary uncertainty into a clear, actionable plan for the people who need to save lives.