RL-Based Coverage Path Planning for Deformable Objects on 3D Surfaces

This paper proposes a reinforcement learning approach to coverage path planning on 3D surfaces with deformable objects. It leverages harmonic UV mapping and scaled grouped convolutions to process contact feedback in simulation, and demonstrates effective wiping tasks on a physical Kinova Gen3 manipulator.

Yuhang Zhang, Jinming Ma, Feng Wu

Published 2026-03-06

Imagine you have a very complex, bumpy, and weirdly shaped object—like a human torso, a car door with a window cut out, or a crumpled piece of fabric. Your goal is to wipe this entire surface clean with a soft sponge.

If you were a human, you'd just look at the object, feel the bumps with your hand, and naturally figure out how to move the sponge to cover every inch without missing a spot or getting stuck in a hole.

Robots, however, are terrible at this. They usually see the world as a flat grid or a rigid box. If you ask a standard robot to wipe a curved, bumpy surface with a squishy sponge, it gets confused. The sponge stretches, the surface curves, and the robot doesn't know where it is or what it's touching.

This paper presents a clever new way to teach a robot how to be a master cleaner for these tricky jobs. Here is the breakdown of their solution, using some simple analogies:

1. The Problem: The "Flat Map" vs. The "Bumpy Ball"

Imagine trying to draw a map of the entire Earth on a flat piece of paper. You have to stretch and squish the continents to make them fit. If you try to navigate a robot using a 3D model of a bumpy surface, the math gets incredibly complicated. The robot has to calculate millions of points in 3D space to know where to move next. It's like trying to solve a puzzle while wearing thick gloves.

2. The Solution: The "Unfolding Trick" (Harmonic UV Mapping)

The authors' first big idea is to flatten the problem.

  • The Analogy: Think of a 3D object (like a basketball or a human arm) as a balloon. If you cut the balloon open and lay it flat on a table, it becomes a 2D shape.
  • The Tech: They use a mathematical trick called Harmonic UV Mapping. This takes the complex 3D surface the robot needs to clean and "unwraps" it onto a flat 2D square.
  • Why it helps: Instead of the robot trying to navigate a 3D maze, it now just has to draw a line on a flat piece of paper. It's much easier to plan a path on a flat map than on a bumpy ball.
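To make the "unfolding trick" concrete, here is a minimal sketch of harmonic flattening on a toy mesh. The paper's method uses harmonic UV mapping on real scanned surfaces; this stand-in uses simple uniform (graph-Laplacian) weights rather than the cotangent weights a production mapper would use. The idea is the same: pin the boundary vertices to a flat 2D shape, then solve a linear system so every interior vertex lands at the average of its neighbors.

```python
import numpy as np

def harmonic_uv(n_vertices, edges, boundary_uv):
    """Flatten a mesh to 2D by solving the Laplace equation with
    uniform graph weights: each interior vertex's UV coordinate is
    the average of its neighbors'. `boundary_uv` maps pinned vertex
    indices to (u, v). Illustrative stand-in, not the paper's code."""
    # Build adjacency lists from the edge set
    nbrs = [[] for _ in range(n_vertices)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    interior = [i for i in range(n_vertices) if i not in boundary_uv]
    idx = {v: k for k, v in enumerate(interior)}
    # Linear system: deg(i) * uv[i] - sum(uv[interior nbrs]) = sum(uv[boundary nbrs])
    A = np.zeros((len(interior), len(interior)))
    rhs = np.zeros((len(interior), 2))
    for i in interior:
        r = idx[i]
        A[r, r] = len(nbrs[i])
        for j in nbrs[i]:
            if j in boundary_uv:
                rhs[r] += boundary_uv[j]
            else:
                A[r, idx[j]] -= 1
    uv = np.zeros((n_vertices, 2))
    for v, p in boundary_uv.items():
        uv[v] = p
    uv[interior] = np.linalg.solve(A, rhs)
    return uv

# Pyramid-like mesh: four boundary corners pinned to the unit square,
# one interior "apex" vertex connected to all four.
edges = [(0, 4), (1, 4), (2, 4), (3, 4), (0, 1), (1, 2), (2, 3), (3, 0)]
boundary = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
uv = harmonic_uv(5, edges, boundary)
print(uv[4])  # the 3D apex lands flat at (0.5, 0.5), the square's center
```

The bumpy 3D apex has nowhere to "stick up" anymore: it gets a plain 2D coordinate, and path planning happens entirely on that flat square.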

3. The Brain: The "Smart Sponge" (Reinforcement Learning)

Once the surface is flattened, they need a brain to figure out the best path. They don't program the robot with strict rules (like "move left, then right"). Instead, they use Reinforcement Learning (RL).

  • The Analogy: Imagine a baby learning to walk. It falls down, gets up, tries again, and slowly learns what works.
  • The Process: The robot is placed in a virtual video game (a simulator called MuJoCo). It tries to wipe the "flat map" millions of times. Every time it covers a new spot, it gets a "point" (reward). Every time it wastes time or misses a spot, it gets a "penalty."
  • The Feature Extractor (SGCNN): To help the robot "see" the map, they use a scaled grouped convolutional neural network (SGCNN) that scans the 2D map the way a person scans a maze, spotting coverage patterns and boundaries at a glance.
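The reward loop described above can be sketched as a toy environment. Everything here is illustrative, not from the paper: the grid size, the square "sponge footprint", and the reward scales (+1 per newly covered cell, −0.1 per step) are assumptions chosen to show the shape of the training signal.

```python
import numpy as np

class FlatCoverageEnv:
    """Toy stand-in for the flattened coverage task: an agent slides a
    square 'sponge footprint' over a 2D grid, earning reward for newly
    covered cells and paying a small penalty every step."""
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up/down/left/right

    def __init__(self, size=8, footprint=2):
        self.size, self.fp = size, footprint
        self.reset()

    def reset(self):
        self.pos = np.array([0, 0])
        self.covered = np.zeros((self.size, self.size), dtype=bool)
        self._stamp()
        return self.covered.copy()

    def _stamp(self):
        # Mark the sponge footprint as covered; count freshly covered cells
        r, c = self.pos
        new = np.count_nonzero(~self.covered[r:r + self.fp, c:c + self.fp])
        self.covered[r:r + self.fp, c:c + self.fp] = True
        return new

    def step(self, action):
        self.pos = np.clip(self.pos + self.MOVES[action], 0, self.size - self.fp)
        new_cells = self._stamp()
        reward = 1.0 * new_cells - 0.1   # "point" for coverage, "penalty" for time
        done = self.covered.all()
        return self.covered.copy(), reward, done

env = FlatCoverageEnv(size=4, footprint=2)
obs = env.reset()
obs, r, done = env.step(3)  # move right: two fresh cells, minus the step cost
print(r)  # 1.9
```

An RL algorithm (the paper trains in MuJoCo) repeatedly plays episodes like this, and the policy gradually learns paths that maximize total reward, i.e. full coverage in as few steps as possible.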

4. The Action: From Paper Back to Reality

Once the robot learns the perfect path on the flat 2D map, the system "re-wraps" that path back onto the original 3D object.

  • The Result: The robot now knows exactly how to move its arm in 3D space to follow the path it learned on the flat map.
  • The "Squishy" Factor: Because the sponge is soft, the robot doesn't need to be perfect. If the 3D model was slightly wrong, the sponge just squishes a little to fill the gap, ensuring the surface still gets wiped clean.
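The "re-wrapping" step can be illustrated with barycentric coordinates: find which UV triangle a 2D waypoint falls in, compute its blend weights there, then apply the same weights to that triangle's 3D vertices. This is a generic lift from a UV map back to a surface; the paper's exact lookup may differ.

```python
import numpy as np

def uv_to_3d(p_uv, tri_uv, tri_xyz):
    """Lift a 2D point from the flat UV map back onto the 3D surface.
    Compute barycentric weights in the UV triangle, then apply the
    same weights to the triangle's 3D vertices."""
    a, b, c = tri_uv
    # Solve p = a + s*(b - a) + t*(c - a) for (s, t)
    M = np.column_stack([b - a, c - a])
    s, t = np.linalg.solve(M, p_uv - a)
    w = np.array([1 - s - t, s, t])   # barycentric weights, sum to 1
    return w @ tri_xyz                # the same blend of the 3D corners

# One UV triangle and its matching 3D triangle on the curved surface
tri_uv = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
tri_xyz = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 1.0]])
p3d = uv_to_3d(np.array([0.25, 0.25]), tri_uv, tri_xyz)
print(p3d)  # → the 3D point (0.5, 0.5, 0.25)
```

Running every waypoint of the learned 2D path through this lookup yields the 3D trajectory the arm actually follows, and the soft sponge absorbs the small residual errors.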

5. The Results: Better Than the Old Ways

The researchers tested this on 10 different objects (bowls, car doors, human models).

  • Old Methods: Tried to use rigid rules (like a lawnmower going back and forth in straight lines). These often missed spots or took very long, winding paths.
  • Their Method: The AI learned to take a shorter, smoother path that covered more area. It was like comparing a clumsy person mowing a lawn in straight lines versus a professional gardener who intuitively knows exactly where to cut to get the job done fastest.
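The two quantities being compared above, how much area got wiped and how long the path was, can be written down in a few lines. The metric names and the toy data below are illustrative, not the paper's evaluation code.

```python
import numpy as np

def coverage_rate(covered):
    """Fraction of surface cells wiped; higher is better."""
    return covered.mean()

def path_length(waypoints):
    """Total Euclidean length of the wiping path; shorter is better."""
    w = np.asarray(waypoints, dtype=float)
    return np.linalg.norm(np.diff(w, axis=0), axis=1).sum()

# Toy run: 8 of 9 cells wiped along an L-shaped path
covered = np.array([[1, 1, 1],
                    [1, 1, 0],
                    [1, 1, 1]], dtype=bool)
path = [(0, 0), (0, 2), (2, 2)]
print(coverage_rate(covered))  # 8/9 ≈ 0.889
print(path_length(path))       # 2 + 2 = 4.0
```

A good planner pushes coverage toward 1.0 while keeping path length down; the lawnmower-style baselines tend to sacrifice one for the other.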

The Real-World Test

Finally, they took the robot out of the video game and put it in the real world. They used a real robotic arm (Kinova Gen3) to wipe the back of a human mannequin.

  • The Outcome: The robot successfully wiped the entire back, avoiding holes (like the armpits) and covering the curves perfectly.

Summary

In short, this paper teaches a robot how to clean weird, bumpy surfaces by:

  1. Flattening the 3D world into a 2D map (like unfolding a balloon).
  2. Training an AI in a video game to learn the best path on that map.
  3. Translating that path back to the real 3D world, letting the soft sponge handle the small imperfections.

It's a bridge between the messy, flexible real world and the rigid, mathematical world of robots, making them much better at tasks like cleaning, disinfecting, or massaging.