Malicious Or Not: Adding Repository Context to Agent Skill Classification

This paper presents the largest empirical security analysis of the AI agent skill ecosystem to date. The authors show that incorporating repository context reduces malicious classifications from 46.8% of skills to 0.52%, and they uncover new attack vectors such as the hijacking of skills that point to abandoned GitHub repositories.

Florian Holzbauer, David Schmidt, Gabriel Gegenhuber, Sebastian Schrittwieser, Johanna Ullrich

Published 2026-03-18

Imagine you've just bought a brand new, super-smart robot assistant (like a digital butler) that can do anything for you: write code, check your emails, or manage your finances. But this robot is a bit limited out of the box. To make it truly useful, you need to give it "skills"—little digital tools or plugins, like adding a new app to your smartphone.

This paper is about a massive investigation into the safety of these robot skills.

Here is the story, broken down into simple parts:

1. The Problem: The "App Store" Panic

Imagine a world where people are selling these robot skills on different "App Stores." Recently, security guards (automated scanners) started checking these skills and screaming, "DANGER! MALICIOUS!"

In fact, some of these stores claimed that nearly half of all the skills were dangerous. It was like walking into a grocery store and being told that 50% of the apples are rotten. This caused a lot of panic. People started thinking, "Maybe I shouldn't use my robot assistant at all!"

2. The Investigation: The "Big Dig"

The researchers in this paper decided to investigate this panic. They didn't just look at the "app store" labels; they went on a massive scavenger hunt.

  • The Scale: They collected 238,180 unique skills from three different marketplaces and from GitHub (a giant warehouse where developers store their code).
  • The Old Way: They first looked at the skills in isolation, just like the security guards did. They asked, "Does this code look suspicious?"
  • The Result: The old way still flagged a huge number of skills as dangerous.
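To make the "old way" concrete, here is a minimal sketch of an isolation-only scanner. This is illustrative, not the paper's actual tooling: it flags a skill whenever its code contains patterns often associated with malicious behavior, with no idea of where the code lives.

```python
# Illustrative isolation-only scanner (not the paper's actual method).
# Pattern list is a hypothetical example of what such scanners match on.
SUSPICIOUS_PATTERNS = [
    "eval(", "exec(", "subprocess", "base64.b64decode",
    "os.environ", "curl ", "chmod +x",
]

def scan_in_isolation(skill_code: str) -> list[str]:
    """Return every suspicious pattern found in the skill's code."""
    return [p for p in SUSPICIOUS_PATTERNS if p in skill_code]

# A legitimate skill that merely reads an API key from the environment
# still gets flagged, because the scanner never sees the project around it.
hits = scan_in_isolation('key = os.environ["API_KEY"]')
```

The problem is visible immediately: `hits` is non-empty for perfectly normal code, which is exactly how you end up declaring nearly half an ecosystem "malicious."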

3. The Twist: Context is King

Here is where the paper gets clever. The researchers realized that looking at a skill in isolation is like judging a book by its cover without reading the story.

The Analogy:
Imagine you find a knife in a kitchen.

  • Scenario A: You find a knife lying alone in a dark alley. You think, "That's dangerous! Someone might get hurt!"
  • Scenario B: You find the same knife in a chef's hand inside a busy, well-lit restaurant kitchen. You think, "That's fine. It's a tool for cooking."

The security scanners were only looking at the knife in the dark alley. They didn't see the kitchen.

The researchers developed a new method: Repository-Aware Analysis. They looked at the "kitchen" (the GitHub repository) where the skill lived. They asked:

  • Is this skill part of a larger, trusted project?
  • Does the code around it make sense?
  • Is the developer an active, real person?
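The three questions above can be sketched as a small decision rule. This is a simplified illustration of the repository-aware idea, assuming hypothetical context signals (the field names below are not the paper's exact feature set): an isolation-only "malicious" verdict gets downgraded when the surrounding repository looks like a normal, maintained project.

```python
from dataclasses import dataclass

@dataclass
class RepoContext:
    exists: bool           # does the linked GitHub repo still exist?
    owner_active: bool     # is the developer an active, real person?
    code_consistent: bool  # does the code around the skill make sense?

def classify(isolated_hits: int, ctx: RepoContext) -> str:
    """Combine an isolation-only scan result with repository context."""
    if isolated_hits == 0:
        return "benign"
    if ctx.exists and ctx.owner_active and ctx.code_consistent:
        return "benign"       # the knife is in a chef's kitchen
    if not ctx.exists:
        return "hijackable"   # abandoned house: a different risk entirely
    return "suspicious"       # flagged code AND a shady neighborhood
```

Note the design choice: context never makes a clean skill look dirty; it only decides whether flagged code is a chef's knife or an alley knife.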

4. The Big Reveal: It Was Mostly a False Alarm

When they added this "context" to their analysis, the results changed dramatically.

  • Before: Isolation-only scanners flagged 46.8% of skills as malicious.
  • After: With the "kitchen" context, the vast majority of those flagged skills turned out to be ordinary tools being misunderstood.
  • The Real Danger: Only 0.52% of the skills were actually in truly malicious environments, which means roughly 99.5% of the ecosystem is benign.

It turns out the security guards were crying wolf so often that they were scaring everyone away from perfectly safe tools.

5. The Real Villains: The "Abandoned Houses"

While most skills were safe, the researchers did find some real new dangers that nobody knew about yet.

The "Hijacked House" Attack:
Imagine you buy a house, but the previous owner moves away and forgets to change the locks. A stranger walks in, changes the address on the mailbox, and starts selling you "fresh bread" (which is actually poison).

  • The researchers found that some skills point to abandoned GitHub repositories.
  • Because the original owners left, bad actors could take over those empty digital "houses," change the code, and hijack the skills.
  • They found 121 skills that were vulnerable to this. One of these vulnerable skills had already been installed over 1,000 times!
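A minimal sketch of how one might test for this "abandoned house" condition, assuming the skill links to a GitHub repository (the owner/repo names below are hypothetical, and this is not the paper's actual detection pipeline). The key insight: a deleted repo is bad, but a deleted *owner account* is worse, because anyone can re-register the username and recreate the repo under it.

```python
import urllib.error
import urllib.request

def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request (e.g. a GitHub API URL)."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def hijack_risk(repo_status: int, owner_status: int) -> str:
    """Interpret status codes for a skill's linked repository and its owner."""
    if repo_status == 200:
        return "repo alive"
    if owner_status == 404:
        return "username claimable: full hijack possible"
    return "repo deleted, owner still exists"

# Usage (hypothetical skill pointing at github.com/ghost-owner/old-skill):
# repo  = http_status("https://api.github.com/repos/ghost-owner/old-skill")
# owner = http_status("https://api.github.com/users/ghost-owner")
# print(hijack_risk(repo, owner))
```

When both requests return 404, the "house" is empty and the "street address" itself is up for grabs, which is the scenario the stranger-with-the-mailbox analogy describes.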

6. The Leaky Faucet

They also found a privacy leak. One of the marketplaces (ClawHub) was accidentally showing the private email addresses of the people who created the skills. It's like a store displaying the home addresses of all its customers on the front window.

Summary: What Should We Take Away?

  1. Don't Panic: The ecosystem isn't 50% dangerous. It's actually quite safe (99.5% benign) if you look at the whole picture.
  2. Look at the Whole Picture: You can't judge a tool just by its description. You need to see where it lives and who built it.
  3. Fix the Locks: We need to stop using "abandoned" digital houses for our tools. If a project is dead, the marketplace needs to move the skill to a new, safe home so hackers can't sneak in.

In short: The paper tells us that the "wolf" isn't as scary as we thought, but we do need to make sure we aren't leaving our digital front doors unlocked.
