Bi-AQUA: Bilateral Control-Based Imitation Learning for Underwater Robot Arms via Lighting-Aware Action Chunking with Transformers

Bi-AQUA is a novel bilateral control-based imitation learning framework for underwater robot arms that integrates transformer-based action chunking with explicit lighting modeling to achieve robust performance in challenging, variable illumination conditions.

Takeru Tsunoori, Masato Kobayashi, Yuki Uranishi

Published 2026-03-09

Imagine trying to teach a robot arm to pick up a toy and put it in a box while you are both underwater. Now, imagine the water is murky, the light is flickering, and the colors are shifting from red to blue to green every few seconds. For a standard robot, this is a nightmare. It would get confused, think the toy is a different color, or miss the box entirely because its "eyes" can't handle the weird lighting.

This paper introduces Bi-AQUA, a new way to teach underwater robots that solves two big problems at once: bad lighting and lack of touch.

Here is the simple breakdown using some everyday analogies:

1. The Problem: The "Foggy Glasses" and "Blind Hands"

Most underwater robots today are like a person wearing foggy glasses trying to thread a needle.

  • The Lighting Issue: Underwater, light behaves strangely. It gets absorbed (making things look dark), scatters (making things look blurry), and changes color (making a red ball look black). Standard robot brains get confused because they expect the world to look the same as it did when they were learning.
  • The Touch Issue: Many robots only use cameras (vision). But in water, you often need to feel things to know if you've grabbed something or if you're pushing against a wall. Robots that only "see" are like a blindfolded person trying to open a drawer; they might bump into it but never know when it's actually closed.

2. The Solution: The "Master and Apprentice" with Super-Senses

The researchers built a system called Bi-AQUA (Bilateral Control-Based Imitation Learning). Think of it as a Master and Apprentice setup.

  • The Master (Leader): A human operator sits on a boat or in a dry room, holding a robot arm. They can see clearly and feel the water resistance.
  • The Apprentice (Follower): A robot arm is underwater. It tries to copy the Master's movements exactly.
  • The "Bilateral" Magic: This is the key. The Master doesn't just tell the Apprentice where to move; they also share force. If the Apprentice bumps into a rock, the Master feels a "push" in their hand. If the Master pushes hard, the Apprentice knows to push hard too. This gives the robot a sense of "touch" even though it's underwater.

3. The Secret Sauce: The "Lighting Translator"

The real breakthrough in this paper is how Bi-AQUA handles the changing lights.

Imagine you are learning to drive a car. You practice in daylight. But what if you suddenly had to drive at night, then in a tunnel, then in a blizzard? You would crash.

Bi-AQUA solves this by giving the robot a "Lighting Translator" (a special AI brain component).

  • The Translator's Job: Before the robot decides how to move, it looks at the camera image and asks, "What kind of lighting is this? Is it red? Is it flickering?"
  • The "FiLM" Filter: Think of this like putting on different pairs of sunglasses. If the light is red, the robot automatically "tunes" its vision to understand that redness is normal. If the light is blue, it adjusts again. It doesn't just ignore the weird light; it uses the information about the light to make better decisions.
  • The "Token": The robot also carries a little "note" (a token) in its brain that says, "Hey, remember, the light is weird right now." This note travels through the robot's decision-making process, ensuring every step it takes accounts for the current lighting.
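The "sunglasses" idea maps onto a real technique called FiLM (feature-wise linear modulation). The sketch below is illustrative, not the paper's architecture: a lighting descriptor is mapped by a stand-in linear "lighting encoder" to per-channel scale (gamma) and shift (beta) values, which then re-tune the visual features before the policy reads them.

```python
# Minimal FiLM sketch (hypothetical encoder weights and feature sizes):
# lighting info -> (gamma, beta) -> per-channel rescaling of features.

def film_modulate(features, gamma, beta):
    """Apply y_c = gamma_c * x_c + beta_c to each feature channel."""
    return [g * x + b for x, g, b in zip(features, gamma, beta)]

def lighting_to_film_params(lighting_vec, w_gamma, w_beta):
    """Stand-in 'lighting encoder': a linear map from a lighting
    descriptor (e.g. dominant hue, brightness) to FiLM parameters.
    Gamma is centered at 1 so neutral lighting is the identity map."""
    gamma = [sum(w * l for w, l in zip(row, lighting_vec)) + 1.0
             for row in w_gamma]
    beta = [sum(w * l for w, l in zip(row, lighting_vec))
            for row in w_beta]
    return gamma, beta

# Under "neutral" lighting (zero descriptor), FiLM leaves features alone;
# a red or blue cast would shift gamma/beta and re-tune every channel.
features = [0.5, -1.2, 3.0]
gamma, beta = lighting_to_film_params([0.0, 0.0],
                                      w_gamma=[[0.2, 0.1]] * 3,
                                      w_beta=[[0.3, -0.1]] * 3)
print(film_modulate(features, gamma, beta))  # → [0.5, -1.2, 3.0]
```

The "token" works alongside this: instead of multiplying into the features, the lighting summary rides through the transformer as an extra input vector, so later decision-making steps can also attend to the current lighting condition.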

4. The Results: From "Clumsy" to "Pro"

The researchers tested this in a real water tank with tasks like:

  • Pick and Place: Grabbing a block and moving it.
  • Closing a Drawer: A long, tricky task requiring pushing and pulling.
  • Pulling a Peg: A very tight fit that requires precise force.

The results were striking:

  • Old Robots (without the Lighting Translator): They worked perfectly in white light but failed miserably (0% success) when the light turned blue, green, or started changing colors. They were like a person who can only drive in perfect sunny weather.
  • Bi-AQUA: It succeeded 100% of the time, even when the lights were changing every 2 seconds, or when the object was a weird color, or when bubbles were blocking the view. It was like a driver who could handle rain, snow, fog, and night driving without blinking.

The Big Takeaway

Bi-AQUA is like teaching a robot to be a diver who can see through the murk and feel the current. By combining the "sense of touch" from the human operator with a special brain that understands how underwater light works, the robot can finally do complex jobs underwater without getting confused by the dark, colorful, and shifting environment.

It's a huge step toward robots that can actually help us explore the ocean, fix underwater cables, or clean up pollution, rather than just crashing into things when the sun goes down.