Generating Realistic, Protocol-Compliant Maritime Radio Dialogues using Self-Instruct and Low-Rank Adaptation

This paper addresses the scarcity of high-quality maritime radio data by introducing a compliance-aware Self-Instruct framework enhanced with LoRA fine-tuning and a 26-filter verification pipeline to generate realistic, SMCP-compliant VHF dialogues for AI-assisted safety systems.

Gürsel Akdeniz, Emin Cagatay Nakilcioglu

Published 2026-03-06

Imagine the ocean is a giant, chaotic office building where thousands of ships are constantly talking to each other and to the shore using walkie-talkies. This is the world of maritime radio.

The problem? It's a noisy, stressful place. Sometimes the signal is fuzzy, sometimes people speak different languages or accents, and sometimes, in a panic, a captain might forget the exact "script" they are supposed to use. In this high-stakes environment, a misunderstanding isn't just an annoyance; it can lead to collisions, fires, or even sinking ships.

The authors of this paper wanted to build a smart AI assistant to help captains communicate clearly and follow the rules. But to teach an AI how to do this, you need a massive library of "perfect" examples of radio calls. The catch? Real radio recordings are secret, private, and hard to get. You can't just ask a captain for a recording of their emergency call.

So, the researchers decided to build a "fake" library that looks and feels exactly like the real thing. Here is how they did it, explained through a simple story:

1. The "Robot Apprentice" (The Base AI)

They started with a smart robot brain (a Large Language Model called Llama 3.1). Think of this robot as a very talented actor who has read every book in the world but has never been on a ship.

  • The Problem: If you ask this actor to play a captain in distress, they might say, "Help me, I'm sinking!" which sounds dramatic but breaks the strict rules of maritime radio. They might invent fake ship names or forget to say "Mayday" three times.
  • The Goal: We need to train this actor to be a realistic maritime professional who never breaks the rules.

2. The "Scriptwriter" (Self-Instruct)

Real recordings were off-limits for privacy reasons, and writing thousands of scripts by hand would be impractical, so they taught the robot to write its own scripts.

  • They gave the robot a few "seed" examples (like a starter pack of 100 perfect scripts).
  • Then, they told the robot: "Here are some examples. Now, imagine a new emergency (like a fire or a collision), invent a ship name, pick a location, and write a new script."
  • The robot generated thousands of new radio calls.
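
The scriptwriter loop above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: `generate_dialogue` is a hypothetical stand-in for the call to the language model, and the seed texts, vessel names, scenario labels, and prompt wording are all invented for the example.

```python
import random

# Hypothetical seed scripts (invented for illustration; not from the paper).
SEEDS = [
    "MAYDAY MAYDAY MAYDAY, this is motor vessel EXAMPLE ONE. Fire in engine room. Over.",
    "MAYDAY MAYDAY MAYDAY, this is motor vessel EXAMPLE TWO. Collision, taking water. Over.",
    "MAYDAY MAYDAY MAYDAY, this is motor vessel EXAMPLE THREE. Flooding in hold. Over.",
]
SCENARIOS = ["fire", "collision", "flooding"]  # assumed scenario labels


def build_prompt(seeds, scenario, rng, k=2):
    """Sample k seed scripts and ask for a new script about `scenario`."""
    shots = rng.sample(seeds, k)
    examples = "\n\n".join(f"Example {i + 1}:\n{s}" for i, s in enumerate(shots))
    return f"{examples}\n\nNow write a new radio call about a {scenario}."


def generate_dialogue(prompt):
    """Placeholder for the model call (assumption: any text generator fits here)."""
    return f"[model output for a prompt of {len(prompt)} characters]"


def self_instruct(seeds, n_new, seed=0):
    """Produce n_new candidate scripts, each prompted with a random scenario."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_new):
        scenario = rng.choice(SCENARIOS)
        prompt = build_prompt(seeds, scenario, rng)
        candidates.append({"scenario": scenario, "text": generate_dialogue(prompt)})
    return candidates


candidates = self_instruct(SEEDS, n_new=5)
```

The candidates produced this way are raw and unchecked. In the original Self-Instruct recipe, outputs that survive filtering are added back to the example pool; whether this paper does the same is not stated in this summary.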

3. The "Strict Inspector" (The 26-Filter Pipeline)

This is the most important part. The robot is creative, but it's also prone to lying (hallucinating) and making mistakes.

  • Imagine a 26-point security checkpoint at an airport. Every single script the robot wrote had to pass through this checkpoint.
  • The Filters checked things like:
    • Did you say "Mayday" three times? (If no -> Reject)
    • Did you invent a fake ship ID number? (If yes -> Reject)
    • Is the ship actually on the ocean, or did you put it in a desert? (If desert -> Reject)
    • Did you repeat the same sentence 50 times? (If yes -> Reject)
  • Only the scripts that passed all 26 checks were kept; the rest were discarded. As a result, every script in the "fake" library had passed every compliance check in the pipeline.
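
A checkpoint like this can be pictured as a list of small pass/fail functions. The sketch below uses three illustrative filters, not the paper's actual 26; the function names, regular expressions, and thresholds are assumptions. The ship-ID check assumes the ID is an MMSI (a nine-digit number) and only validates the format, which is weaker than detecting an invented ID.

```python
import re
from collections import Counter


def has_triple_mayday(text):
    """A distress call should contain the word MAYDAY three times in a row."""
    return re.search(r"\bmayday\W+mayday\W+mayday\b", text, re.IGNORECASE) is not None


def has_valid_mmsi_format(text):
    """Every number introduced as an MMSI must be exactly nine digits."""
    ids = re.findall(r"\bMMSI\s+(\d+)", text, re.IGNORECASE)
    return all(len(i) == 9 for i in ids)


def no_excessive_repetition(text, max_repeats=2):
    """Reject outputs in which any sentence occurs more than max_repeats times."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?\n]+", text) if s.strip()]
    return all(count <= max_repeats for count in Counter(sentences).values())


FILTERS = [has_triple_mayday, has_valid_mmsi_format, no_excessive_repetition]


def failed_filters(text):
    """Return the names of all filters the text fails (empty list = accepted)."""
    return [f.__name__ for f in FILTERS if not f(text)]


def keep_compliant(texts):
    """Keep only the texts that pass every filter."""
    return [t for t in texts if not failed_filters(t)]


good = "MAYDAY MAYDAY MAYDAY. This is motor vessel Example, MMSI 123456789. Fire on board. Over."
bad = "Help me, I'm sinking! MMSI 12345."

print(failed_filters(good))  # []
print(failed_filters(bad))   # ['has_triple_mayday', 'has_valid_mmsi_format']
```

Returning the names of the failed filters, rather than a bare yes/no, makes it easy to see which rules the generator breaks most often.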

4. The "Specialized Training" (LoRA)

Now they had a perfect library of fake-but-realistic radio calls. They used this library to give the robot a specialized "boot camp."

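  • Instead of retraining the whole robot (which would demand far more computing power and time), they used a technique called LoRA (Low-Rank Adaptation), which leaves the original model untouched and trains only a small set of extra parameters.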
  • The Analogy: Imagine the robot is a general doctor. You don't need to retrain them to become a heart surgeon from scratch. You just give them a specialized pocket guide (the LoRA adapter) that teaches them the specific rules of maritime emergencies.
  • This was fast, cheap, and efficient.
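
To see why the pocket guide is so much smaller than the doctor, it helps to count parameters. LoRA keeps a weight matrix W frozen and learns two thin matrices B and A, so the effective weight becomes W + (alpha / r) · B·A. The sketch below is plain Python for illustration only; the layer size of 4096 and the rank of 8 are assumptions, not values taken from the paper.

```python
def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for full fine-tuning vs. a LoRA adapter on one matrix."""
    full = d_out * d_in               # every entry of W is trainable
    adapter = rank * (d_out + d_in)   # B is d_out x rank, A is rank x d_in
    return full, adapter


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


full, adapter = lora_param_counts(4096, 4096, 8)
print(full, adapter, full // adapter)  # 16777216 65536 256

# Tiny worked example: B starts at zero in LoRA, so before any training
# the adapted weight is identical to the frozen one.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0], [0.0]]
A = [[0.5, -0.5]]
unchanged = lora_effective_weight(W, B, A, alpha=1, rank=1)
```

With these assumed sizes the adapter holds 256 times fewer trainable numbers than the matrix it adapts, which is where the speed and cost savings come from.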

5. The Result: A "Digital Twin" of Reality

After training, they tested the robot.

  • Before training: The robot sounded like a confused tourist. It failed almost every time.
  • After training: The robot sounded like a seasoned captain. It followed the rules, used the right jargon, and created unique, logical stories about ship emergencies.

Why Does This Matter?

This "fake" library is now being released to the public. It allows scientists to build better Speech-to-Text systems that can understand noisy radio calls, and Decision Support systems that can warn captains if they are about to say something dangerous.

In short: The researchers built a "fake reality" so perfect that it can train AI to save lives in the real world, all without needing to steal any private data from real ships. They turned a robot actor into a maritime safety expert using a strict inspector and a specialized pocket guide.