Network Traffic Analysis with Process Mining: The UPSIDE Case Study

This paper proposes a process mining-based method that analyzes online gaming network traffic to characterize device states, without supervision, as interpretable Petri nets and to classify different video games accurately. It demonstrates the method's effectiveness through the UPSIDE case study involving Clash Royale and Rocket League.

Francesco Vitale, Paolo Palmiero, Massimiliano Rak, Nicola Mazzocca

Published Tue, 10 Ma

Imagine you are a detective trying to figure out what kind of party is happening in a massive, noisy warehouse just by listening to the conversations between people.

That is essentially what this paper does, but instead of a party, it's an online video game, and instead of people talking, it's computer data packets zipping back and forth.

Here is the breakdown of their work, explained simply:

The Problem: The "Noisy Warehouse"

Online games like Clash Royale and Rocket League generate a huge amount of data. It's like a chaotic crowd where everyone is shouting at once.

  • The Challenge: If you just look at the raw data, it's a mess. It's "interleaved" (mixed up) and "noisy" (full of static).
  • The Old Way: Usually, computers use "Deep Learning" (AI) to guess what game is being played. But this is like a black box. The computer says, "I think it's Rocket League," but it can't tell you why. It's not very trustworthy if you need to explain the decision to a human.

The Solution: The "Process Mining" Detective

The authors propose a new method built on Process Mining. Think of it not as a black box, but as a cartographer: it draws a map of how the data moves.

They use a technique to turn the messy noise into a clear, readable flowchart (called a Petri Net). Imagine turning a chaotic crowd into a diagram showing: "First, the player sends a move, then the server says 'Got it,' then the player sends another move."

How They Did It (The 4-Step Recipe)

  1. Listening (Monitoring): They set up a "tap" on the network to record all the traffic from devices playing games during a big event called UPSIDE. They watched people playing Clash Royale (a strategy card game) and Rocket League (soccer with rocket-powered cars).
  2. Chunking (Feature Extraction): The data stream is too long to analyze all at once. So, they chopped it into small, manageable "windows" (like cutting a long movie into 5-second clips).
  3. Grouping (State Characterization): They used a smart clustering algorithm to group these clips.
    • Analogy: Imagine you have a pile of mixed-up socks. You sort them into piles: "Socks with holes," "Socks with stripes," and "Socks with holes and stripes."
    • In their case, they sorted the data into different "states" based on what the computer was doing (e.g., "Sending a big update," "Waiting for a reply," "Sending a quick ping").
  4. Mapping (Modeling): For each group of socks (state), they drew a flowchart (Petri Net) showing the rules of that specific behavior.
    • The Result: They ended up with a set of clear, interpretable maps. One map showed how Clash Royale talks to its server, and another showed how Rocket League talks to its server.
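
To make the "chunking" and "mapping" steps more concrete, here is a minimal Python sketch. It is not the authors' code: the packet records, the activity labels (direction plus TCP flags), and the helper names are all assumptions made up for illustration. It cuts a packet stream into fixed-length windows and counts which activity directly follows which, a relation that process discovery algorithms commonly start from before building a flowchart.

```python
from collections import Counter, defaultdict

# Hypothetical packet records: (timestamp in seconds, activity label).
# The labels (direction plus TCP flags) are an assumption for illustration.
# Packets are assumed to be sorted by timestamp.
packets = [
    (0.2, "client:PSH-ACK"),
    (0.5, "server:ACK"),
    (1.1, "client:PSH-ACK"),
    (3.4, "server:PSH-ACK"),
    (3.9, "client:ACK"),
    (7.0, "client:PSH-ACK"),
]

def split_into_windows(packets, window_seconds):
    """Group packets into consecutive fixed-length time windows."""
    windows = defaultdict(list)
    for timestamp, label in packets:
        windows[int(timestamp // window_seconds)].append(label)
    # Return the traces ordered by window index; empty windows are skipped.
    return [windows[index] for index in sorted(windows)]

def directly_follows(trace):
    """Count how often one activity is immediately followed by another."""
    return Counter(zip(trace, trace[1:]))

traces = split_into_windows(packets, window_seconds=3)
for trace in traces:
    print(trace, dict(directly_follows(trace)))
```

Each printed line is one "clip" together with its step-by-step pairs. A real pipeline would feed such traces to a process discovery tool to obtain a Petri Net rather than stopping at pair counts.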

The Experiment: Can They Tell the Games Apart?

They tested their method to see if they could look at a piece of network traffic and say, "Ah, this is Clash Royale," or "This is Rocket League."

  • The Twist: They didn't just look at the data; they looked at the flow of the data.
  • The Findings:
    • Too much data at once (Long Windows): If they looked at too long a chunk of time, the maps became blurry and generic. It was like trying to summarize a whole movie in a single sentence; you lose the details.
    • Too few groups (Few States): If they only had two piles of socks, the maps were too simple and couldn't tell the games apart.
    • The Sweet Spot: They found the perfect balance: 3-second windows and 3 different states. This allowed them to distinguish the games with 88% accuracy.
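
The "number of states" knob is simply the number of piles the clustering step is allowed to make. The summary does not say which clustering algorithm the authors used, so the sketch below swaps in plain k-means as a stand-in. The two features per window (packet count and mean payload size) and their values are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each window to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the windows assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical per-window features: (packet count, mean payload size in bytes).
windows = [
    (40, 900), (3, 60), (15, 120),
    (42, 950), (2, 64), (14, 110),
]
states = kmeans(windows, k=3)
print(states)
```

With k=3, each window ends up in one of three states. Rerunning with k=2 forces two of those piles to merge, which is the "too few groups" situation described above.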

Why This Matters (The "Aha!" Moment)

The coolest part of this paper is Interpretability.

  • Old AI: "I think this is Rocket League." (You have to trust the robot).
  • This Method: "I think this is Rocket League because I see a specific pattern where the player sends a burst of small messages, and the server replies with a specific type of 'Okay' packet."
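
One way to picture "classification with a reason attached" is to check how well a new piece of traffic fits each game's map and keep the matching steps as evidence. The sketch below is an assumption-heavy toy, not the paper's classifier: each "model" is just a made-up set of allowed steps, far simpler than a real Petri Net, and the labels are invented.

```python
# Hypothetical per-game models: the set of "A is directly followed by B"
# steps each game's flowchart allows. Real Petri nets are richer than this,
# and these labels are invented for illustration.
MODELS = {
    "Clash Royale": {("client:PSH", "server:ACK"), ("server:ACK", "client:PSH")},
    "Rocket League": {("client:DATA", "client:DATA"), ("client:DATA", "server:DATA")},
}

def fitness(trace, allowed_steps):
    """Return the share of steps in the trace that the model allows, plus those steps."""
    steps = list(zip(trace, trace[1:]))
    matched = [step for step in steps if step in allowed_steps]
    score = len(matched) / len(steps) if steps else 0.0
    return score, matched

def classify(trace):
    """Pick the game whose model explains the largest share of the trace."""
    results = {game: fitness(trace, steps) for game, steps in MODELS.items()}
    best = max(results, key=lambda game: results[game][0])
    return best, results[best]

game, (score, evidence) = classify(["client:PSH", "server:ACK", "client:PSH"])
print(game, score, evidence)
```

The `evidence` list is the "because" part: it names the exact steps in the traffic that the winning model accounts for.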

They actually drew a picture (a Petri Net) that showed exactly how Clash Royale works: The player's packets carry the TCP "Push" (PSH) flag, which says "Deliver this data immediately, don't wait!" This behavior is characteristic of that game, and their method found it automatically without anyone telling it what to look for.
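
For readers curious where a label like "PSH" comes from: every TCP packet header carries a small set of control bits, and PSH is one of them. The bit values below are the standard TCP ones; the helper name `flag_names` is just an illustrative choice.

```python
# Standard TCP control bits (low six bits of the flags field).
TCP_FLAGS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
    0x20: "URG",
}

def flag_names(flags_byte):
    """Translate a TCP flags value into the names of the bits that are set."""
    return [name for bit, name in TCP_FLAGS.items() if flags_byte & bit]

print(flag_names(0x18))  # ['PSH', 'ACK']
print(flag_names(0x02))  # ['SYN']
```

Turning each packet into a short label like this is one plausible way to get the "activities" that a flowchart can then be drawn over.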

The Takeaway

This paper shows that you don't need a "black box" AI to understand complex network traffic. By using Process Mining, you can:

  1. Clean up the noise in network data.
  2. Draw clear maps (Petri Nets) of how different games behave.
  3. Identify games accurately while being able to explain exactly why you made that choice.

It's like going from guessing the flavor of a soup by taking a blind sip, to actually seeing the recipe and saying, "Ah, I can taste the basil; this is definitely Italian soup!"