GDA-YOLO11: Amodal Instance Segmentation for Occlusion-Robust Robotic Fruit Harvesting

This paper introduces GDA-YOLO11, a novel amodal instance segmentation framework for occlusion-robust robotic fruit harvesting. By inferring complete fruit shapes and accurately estimating picking points, it outperforms existing models on segmentation metrics and achieves higher picking success rates under varying levels of occlusion.

Caner Beldek, Emre Sariyildiz, Son Lam Phung, Gursel Alici

Published 2026-03-02

Imagine a robot trying to pick an orange from a tree. The problem? The orange is hiding behind a bunch of leaves.

Most robots today are like people with very bad eyesight: if they can't see the whole orange, they get confused. They might think the orange isn't there, or they might try to grab just the tiny sliver of orange they can see, missing the fruit entirely or damaging the tree. This leads to wasted food and unhappy farmers.

This paper introduces a new "super-robot brain" called GDA-YOLO11 that solves this problem. Here is how it works, explained simply:

1. The "X-Ray Vision" Trick (Amodal Segmentation)

Imagine you are looking at a car parked behind a fence. You can only see the wheels and part of the roof. A normal camera registers only those visible parts.

GDA-YOLO11 is different. It uses a technique called Amodal Segmentation. Think of it as the robot having "X-ray vision" or a "mental imagination." Even if the robot only sees 30% of the fruit, it doesn't just guess; it mathematically "draws" the invisible 70% of the fruit in its mind. It knows exactly where the whole fruit is, even the parts hidden behind leaves.
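To make the "invisible 70%" concrete, here is a toy Python illustration of the difference between a visible (modal) mask and an amodal mask. The masks are hand-made for illustration; the paper's network predicts the amodal mask directly from the image, and nothing below is the authors' code.

```python
import numpy as np

# Toy scene: a disc-shaped "fruit" partly hidden behind a "leaf".
H, W = 8, 8
yy, xx = np.ogrid[:H, :W]

# Amodal mask: the WHOLE fruit, a disc of radius 3 centered at (4, 4).
amodal = ((yy - 4) ** 2 + (xx - 4) ** 2) <= 9

# A leaf occludes the left part of the image.
leaf = xx < 4

# Visible (modal) mask: only the part of the fruit the camera can see.
visible = amodal & ~leaf

occlusion_rate = 1.0 - visible.sum() / amodal.sum()
print(f"visible pixels: {visible.sum()}, amodal pixels: {amodal.sum()}")
print(f"occlusion rate: {occlusion_rate:.0%}")  # fraction of the fruit hidden
```

An amodal model is trained with labels like `amodal` while its input only shows `visible`, which is exactly the "draw the hidden part in its mind" behavior described above.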

2. How the Robot Learned to See Better

The researchers took a standard, fast robot brain (called YOLO11) and gave it three specific upgrades, like adding special tools to a Swiss Army knife:

  • The "Spotlight" (Global Attention Module): Imagine the robot is in a messy room full of leaves and fruit. It used to get distracted by every leaf. The new "Spotlight" module helps the robot ignore the noise (leaves) and focus intensely on the shape of the fruit, even if it's partially hidden.
  • The "Deep Dive" (Deepened Head): The robot's brain got a little deeper. Instead of just skimming the surface of the image, it now looks closer at the details. This helps it figure out exactly where the edge of the fruit is, even when it's blurry or cut off by a branch.
  • The "Strict Teacher" (Asymmetric Loss): When the robot was learning, it made mistakes. Usually, teachers punish mistakes equally. But this new "teacher" was stricter about one thing: missing the fruit. If the robot said, "I don't see the fruit," but the fruit was actually there (just hidden), the teacher gave it a big penalty. This forced the robot to be extra careful and try to find the fruit even when it was hard to see.

3. The "Safe Grab" Strategy

Once the robot "imagines" the whole fruit, it needs to grab it.

  • Finding the Sweet Spot: Instead of grabbing the edge (which might be a leaf), the robot calculates the "center of gravity" of the whole fruit, including the invisible parts. It's like finding the center of a balloon even if someone is holding a piece of paper in front of it. (A short sketch of this step follows this list.)
  • The Approach: The robot moves its arm to a safe spot, then gently pushes forward to grab the fruit. Because it knows where the whole fruit is, it doesn't accidentally squeeze a leaf or miss the fruit.
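Following the digest's description, the picking point is the center of mass of the full amodal mask. Below is a short sketch of that step, reusing the toy masks from the first snippet; mapping the 2D point into a 3D grasp pose for the arm is part of the real pipeline but omitted here.

```python
import numpy as np

def centroid(mask):
    """Center of mass (row, col) of a binary mask."""
    rows, cols = np.nonzero(mask)
    return rows.mean(), cols.mean()

# Same toy scene as before: a disc-shaped fruit, left side hidden by a leaf.
H, W = 8, 8
yy, xx = np.ogrid[:H, :W]
amodal = ((yy - 4) ** 2 + (xx - 4) ** 2) <= 9  # whole fruit, hidden parts too
visible = amodal & (xx >= 4)                   # only the unoccluded part

print("amodal centroid :", centroid(amodal))   # lands on the true center (4, 4)
print("visible centroid:", centroid(visible))  # biased toward the visible side
```

Aiming at the amodal centroid rather than the visible centroid is what keeps the gripper from closing on the edge of the fruit, or on a leaf.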

4. The Results: Does it Work?

The researchers tested this in a lab with a fake tree and real oranges. They covered the oranges with leaves to different degrees:

  • No leaves: Both the old robot and the new robot did great.
  • A few leaves: Both did well.
  • Heavy leaf cover: This is where the magic happened.
    • The old robot (YOLO11) got confused and failed to pick about 80% of the heavily hidden fruits.
    • The new robot (GDA-YOLO11) did much better, successfully picking nearly twice as many of the hidden fruits as the old one.

The Big Picture

Think of this technology as giving a robotic farmer a pair of glasses that let them see through the clutter of a garden. By teaching the robot to "imagine" the full shape of the fruit, they can harvest more food, waste less, and work in messy, real-world fields where leaves and branches are always in the way.

This is the first time this kind of "imagination" has been successfully tested on a real robot arm picking real fruit, paving the way for fully autonomous farms in the future.
