SynDocDis: A Metadata-Driven Framework for Generating Synthetic Physician Discussions Using Large Language Models

SynDocDis is a novel, privacy-preserving framework that leverages large language models and structured metadata to generate high-quality, clinically accurate synthetic dialogues between physicians, effectively addressing data access limitations while demonstrating strong performance in medical education and decision support applications.

Beny Rubinstein, Sérgio Matos

Published 2026-04-13

Imagine you are trying to teach a robot how to be a doctor. You want the robot to learn how real doctors talk to each other when they are confused about a patient's case, debating treatments, and sharing their expertise.

The problem? Real doctor conversations are like top-secret vaults. They contain sensitive patient information, and strict privacy laws (like HIPAA in the US or GDPR in Europe) lock those vaults tight. Doctors are also often afraid to share their raw thoughts because they worry about being judged or sued if they make a mistake.

So, how do you train the robot without breaking the law or hurting anyone's feelings?

Enter SynDocDis, a new "recipe" created by researchers Beny Rubinstein and Sérgio Matos. Think of it as a high-tech ghostwriter that writes fake doctor conversations that sound 100% real but contain zero real secrets.

Here is how it works, broken down into simple analogies:

1. The "Skeleton" vs. The "Flesh"

Usually, to write a clinical story, you need the whole body: skeleton and flesh. But for privacy reasons, you can't use the whole body.

  • The Old Way: Trying to use real patient records is like trying to build a house using the actual bricks from someone else's home. It's dangerous and illegal.
  • The SynDocDis Way: Instead of the whole house, the researchers take just the blueprint (the metadata). They strip away the patient's name, address, and face, leaving only the "bones" of the case: Patient is a 69-year-old male with a specific type of cancer. He had surgery. Now we need to decide on medication.
  • The Magic: They feed these "bones" into a powerful AI (a Large Language Model). The AI acts like a master sculptor, using the blueprint to grow new "flesh" (the conversation) around it. The result is a brand-new, synthetic conversation that never happened, but feels exactly like it did.
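The "bones" step above can be sketched in a few lines of code. This is a hypothetical illustration, not the paper's actual implementation: the field names, the example diagnosis, and the `build_case_summary` helper are all invented to show what a de-identified metadata skeleton might look like before it is handed to an LLM.

```python
# Hypothetical sketch of the metadata "skeleton" described above.
# All field names and values are illustrative, not from the paper.

case_metadata = {
    "age": 69,
    "sex": "male",
    "diagnosis": "prostate cancer",          # example condition only
    "history": ["radical surgery"],
    "open_question": "choice of adjuvant medication",
}

def build_case_summary(meta):
    """Turn the de-identified skeleton into a short text seed for the LLM."""
    history = ", ".join(meta["history"])
    return (
        f"Patient is a {meta['age']}-year-old {meta['sex']} with "
        f"{meta['diagnosis']}. Prior care: {history}. "
        f"Open question: {meta['open_question']}."
    )

print(build_case_summary(case_metadata))
```

Note that nothing identifying survives this step: no name, no dates, no location, just the clinical facts needed to seed a realistic discussion.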

2. The "Director's Script" (CIDI Framework)

You can't just tell the AI, "Write a chat between doctors." It might sound robotic or make things up.
The researchers gave the AI a very specific script called CIDI (Context, Instructions, Details, Input).

  • The Analogy: Imagine a movie director talking to actors. Instead of saying, "Just act natural," the director says: "You are Dr. Smith, a grumpy but brilliant oncologist. You need to challenge Dr. Jones's idea about the chemotherapy. Use big medical words, but keep it clear. Oh, and make sure you cite a study from 2023."
  • The AI follows these strict instructions to ensure the fake conversation sounds like a real, high-level medical debate, complete with disagreements, clarifications, and expert advice.
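A CIDI-style prompt can be pictured as four labeled sections stacked on top of each other. The sketch below is an assumption about how such a prompt might be assembled; the section wording, the `build_cidi_prompt` helper, and the example case are invented for illustration, and the paper's exact templates may differ.

```python
# Illustrative assembly of a CIDI (Context, Instructions, Details, Input)
# prompt, as described above. Section text is invented for illustration.

def build_cidi_prompt(context, instructions, details, case_input):
    """Stack the four CIDI sections into a single prompt string."""
    sections = [
        ("Context", context),
        ("Instructions", instructions),
        ("Details", details),
        ("Input", case_input),
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections)

prompt = build_cidi_prompt(
    context="You are simulating a case discussion between two oncologists.",
    instructions="Dr. A proposes a treatment; Dr. B challenges it with evidence.",
    details="Use precise clinical terminology and keep the exchange collegial.",
    case_input="69-year-old male, post-surgical, adjuvant therapy undecided.",
)
# `prompt` would then be sent to an LLM chat endpoint.
print(prompt)
```

The point of the structure is exactly the director analogy: each section constrains a different failure mode (who is talking, what they must do, how they must say it, and which case they are discussing).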

3. The "Taste Test"

After the AI wrote 9 different fake medical debates (mostly about cancer and liver issues), the researchers didn't just trust the computer. They called in five real doctors to taste-test the food.

  • The Review: These doctors read the fake chats and rated them on a scale of 1 to 5.
  • The Verdict: The results were delicious!
    • Communication: The fake doctors sounded incredibly natural (4.4 out of 5). They used the right jargon, listened to each other, and argued politely.
    • Medical Accuracy: The advice given was mostly correct and relevant (4.1 out of 5).
    • Privacy: Not a single real patient was mentioned. The "ghost" conversations were safe to share.
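The scoring above is just a per-dimension average of five reviewers' 1-to-5 ratings. The snippet below shows that arithmetic with invented scores; the individual numbers are not from the paper (only the published means, 4.4 and 4.1, are), so the printed values are illustrative.

```python
# Averaging five reviewers' 1-to-5 scores per dimension.
# Individual scores below are invented for illustration; the paper
# reports only the resulting means.
from statistics import mean

ratings = {
    "communication": [5, 4, 5, 4, 4],
    "medical_accuracy": [4, 4, 5, 4, 4],
}

for dimension, scores in ratings.items():
    print(f"{dimension}: {mean(scores):.1f} / 5")
```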

Why Does This Matter?

Think of this framework as a safe training ground.

  • For AI: It gives robots a massive library of "practice games" to learn how to be better medical assistants without ever seeing a real patient's private data.
  • For Doctors: It creates a way to share knowledge and debate tricky cases without fear of legal trouble.
  • For Students: It provides realistic scenarios for medical students to learn how to think like a specialist.

The One Catch

The researchers admitted that sometimes the AI's "references" (the studies it cites) were a little old, and sometimes the fake doctors didn't argue with each other as much as real ones do. It's like a new actor who knows the lines perfectly but hasn't quite mastered the art of improvising a wild argument yet. But with more practice and better "scripts," this is expected to get even better.

In a nutshell: SynDocDis is a privacy-safe magic trick. It takes the essence of real medical problems and uses AI to conjure up realistic, safe, and educational conversations that help us build better medical tools for the future.
