`pandemonium`: High Dimensional Analysis in Linked Spaces

The paper introduces `pandemonium`, an R package for high-dimensional analysis in linked spaces. It combines cluster analysis with linked visualizations, such as non-linear dimension reduction and animated tours, to explore relationships between predictors and responses in complex datasets, for example neural network activations and multivariable physical models.

Original authors: Gabriel McCoy, German Valencia, Ursula Laa

Published 2026-05-29

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you are trying to solve a giant, complex puzzle where you have two different sets of clues. One set of clues describes what you put in (like ingredients in a recipe or settings on a machine), and the other set describes what comes out (like the taste of the cake or the machine's output).

The problem is that there are so many ingredients and so many possible tastes that it's impossible to see the pattern just by looking at a spreadsheet. You need a way to see how the ingredients together create specific tastes.

This is exactly what the pandemonium R package does. It's a digital "magic window" that helps researchers connect the dots between two high-dimensional worlds.

Here is how it works, using simple analogies:

1. The Two Rooms (Linked Spaces)

Think of your data as two separate rooms:

  • Room A (The Clustering Space): This is where you group things based on how similar they are. Imagine sorting a pile of mixed-up socks by color and pattern.
  • Room B (The Linked Space): This is where you look at the original details. Imagine looking at the same socks to see what fabric they are made of or where they were bought.

Usually, researchers look at Room A, then walk over to Room B and try to guess how they relate. pandemonium puts a giant, two-way mirror between the rooms. When you point at a group of socks in Room A, the mirror instantly highlights those exact same socks in Room B, showing you their fabric and origin.
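
The "mirror" works because both rooms describe the same observations, so a selection is just a set of row indices that can be looked up in either table. The sketch below illustrates that idea in Python with made-up toy data; it is not the package's API, and the names `clustering_space`, `linked_space`, and `select_in_linked_space` are hypothetical.

```python
# Toy data: each row is one observation, and the SAME row index
# is used in both spaces. All names and numbers are invented.
clustering_space = [  # Room A: e.g. model outputs or observables
    [0.1, 0.2],
    [0.2, 0.1],
    [5.0, 5.1],
    [5.2, 4.9],
]
linked_space = [  # Room B: e.g. inputs or parameters for the same rows
    {"temperature": 12.0, "humidity": 0.80},
    {"temperature": 14.0, "humidity": 0.75},
    {"temperature": 29.0, "humidity": 0.30},
    {"temperature": 31.0, "humidity": 0.35},
]


def select_in_linked_space(selected_rows, linked):
    """Return the linked-space records for rows selected in the clustering space."""
    return [linked[i] for i in sorted(selected_rows)]


# "Pointing" at rows 2 and 3 in Room A highlights the same rows in Room B.
selected = {2, 3}
highlighted = select_in_linked_space(selected, linked_space)
print(highlighted)
```

The only requirement is that the two tables stay row-aligned; everything else (colors, plots, summaries) can be derived from the shared indices.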

2. The Magic Lens (Clustering)

The tool starts by organizing the data in Room A. It uses a method called hierarchical clustering, which is like folding a map. You can zoom out to see a few big regions (like continents) or zoom in to see tiny neighborhoods (like streets).

  • You can say, "Show me 3 big groups," or "Show me 10 small groups."
  • As you change the number of groups, the tool instantly updates the view in both rooms.
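
To make the "zoom in, zoom out" idea concrete, here is a minimal sketch of agglomerative hierarchical clustering in plain Python. It assumes Euclidean distance and complete linkage, which are illustrative choices rather than settings taken from the paper, and the function name `agglomerate` is hypothetical. In R, the base functions `hclust` and `cutree` cover the same ground.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def agglomerate(points, k):
    """Merge the two closest clusters (complete linkage) until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance between the farthest pair of members.
                d = max(
                    euclidean(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    # Convert the list of clusters into one label per row.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy points forming three tight pairs (invented for illustration).
points = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.8], [9.0, 0.0], [9.2, 0.1]]
labels_3 = agglomerate(points, 3)  # "Show me 3 groups"
labels_2 = agglomerate(points, 2)  # "Show me 2 groups"
print(labels_3, labels_2)
```

Because the merges are nested, asking for fewer groups only fuses existing ones: the two groups at `k=2` are unions of the three groups at `k=3`.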

3. The Moving Camera (Tours and Projections)

Since the data has too many dimensions to draw on a flat piece of paper, the tool uses two camera tricks to flatten the 3D (or 100D) world onto a 2D screen:

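  • The Non-Linear Lens (UMAP/t-SNE): This is like a funhouse mirror that squishes and stretches the data so that points that are close neighbors in the full high-dimensional space stay close on the screen. The price is distortion: the sizes of groups and the gaps between them on screen should not be read literally.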
  • The Animated Tour: This is like a drone flying through a cloud of data points. Instead of a static photo, you get a video that slowly rotates the cloud, letting you see hidden shapes and gaps that you would miss if you just looked at one angle.
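
The animated tour can be thought of as a smooth sequence of 2D linear projections. The sketch below is a simplified Python illustration: it blends linearly between two random projection planes and re-orthonormalizes each frame with Gram-Schmidt. Real tour implementations interpolate along geodesics between planes, so this is an assumption-laden stand-in, and all function names are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]


def orthonormal_frame(v1, v2):
    """Gram-Schmidt: turn two vectors into an orthonormal 2D basis."""
    e1 = normalize(v1)
    overlap = dot(v2, e1)
    e2 = normalize([b - overlap * a for a, b in zip(e1, v2)])
    return e1, e2


def random_frame(dim, rng):
    v1 = [rng.gauss(0, 1) for _ in range(dim)]
    v2 = [rng.gauss(0, 1) for _ in range(dim)]
    return orthonormal_frame(v1, v2)


def project(points, frame):
    """Project every high-dimensional point onto the 2D frame."""
    e1, e2 = frame
    return [(dot(p, e1), dot(p, e2)) for p in points]


def tour_frames(start, end, steps):
    """Blend from one frame to another, re-orthonormalizing at every step."""
    frames = []
    for s in range(steps + 1):
        t = s / steps
        v1 = [(1 - t) * a + t * b for a, b in zip(start[0], end[0])]
        v2 = [(1 - t) * a + t * b for a, b in zip(start[1], end[1])]
        frames.append(orthonormal_frame(v1, v2))
    return frames


rng = random.Random(0)
# Toy 5-dimensional point cloud (random numbers, not real data).
points = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(20)]
frames = tour_frames(random_frame(5, rng), random_frame(5, rng), steps=10)
animation = [project(points, f) for f in frames]  # one 2D scatter per frame
```

Each entry of `animation` is one frame of the "drone video". Unlike the non-linear lens, every frame is an undistorted linear view, just from a different angle.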

4. The "Brush" (Interactive Selection)

This is the most powerful feature. Imagine you have a paintbrush.

  • You paint a specific cluster of points in the "drone video" (Room A).
  • Instantly, those same points light up in the "static map" (Room B).
  • This lets you ask questions like: "Why do all these points that look similar in the output (Room A) have such different temperatures and humidity levels in the input (Room B)?"
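
Under the hood, a brush amounts to a region test in the 2D view followed by a lookup in the other table. The following Python sketch assumes a rectangular brush and invented toy values; `brush` and `summarize` are hypothetical helper names, not functions from the package.

```python
def brush(projected, x_range, y_range):
    """Return row indices whose 2D coordinates fall inside the brushed rectangle."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    return [
        i
        for i, (x, y) in enumerate(projected)
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    ]


def summarize(rows, linked, variable):
    """Minimum, mean, and maximum of one linked-space variable over the brushed rows."""
    values = [linked[i][variable] for i in rows]
    return min(values), sum(values) / len(values), max(values)


# 2D view of Room A (toy coordinates) and the matching rows of Room B.
projected = [(0.1, 0.2), (0.3, 0.1), (2.0, 2.1), (2.2, 1.9), (0.2, 0.3)]
linked = [
    {"temperature": 5.0, "humidity": 0.9},
    {"temperature": 30.0, "humidity": 0.2},
    {"temperature": 18.0, "humidity": 0.5},
    {"temperature": 19.0, "humidity": 0.5},
    {"temperature": 6.0, "humidity": 0.8},
]

rows = brush(projected, x_range=(0.0, 0.5), y_range=(0.0, 0.5))
temp_summary = summarize(rows, linked, "temperature")
print(rows, temp_summary)
```

In this toy case the brushed points sit together in Room A but span a wide temperature range in Room B, which is exactly the kind of mismatch the brush is meant to surface.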

Real-World Examples from the Paper

The authors tested this tool on two very different problems to show how it works:

Example 1: The Bike Rental Machine (Machine Learning)

  • The Setup: They had a computer model that predicts how many bikes people will rent based on weather (temperature, wind, rain).
  • The Problem: They wanted to know which weather combinations make the model act strangely or predict well.
  • The Solution: They grouped the model's internal "thoughts" (activations) into clusters. Then, they used the mirror to look at the weather data for those groups. They discovered that specific combinations of temperature and humidity were the main drivers for separating the groups. They also checked the "mistakes" (residuals) the model made and saw that the model was actually doing a good job everywhere, with no weird blind spots.
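
One simple way to check for "blind spots" numerically is to compare the typical size of the residuals across clusters. The Python sketch below uses invented rental counts and cluster labels purely for illustration; it is an assumed diagnostic, not the procedure reported in the paper, and `mean_abs_residual_by_cluster` is a hypothetical name.

```python
from collections import defaultdict


def mean_abs_residual_by_cluster(observed, predicted, labels):
    """Average absolute prediction error within each cluster."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, y_hat, lab in zip(observed, predicted, labels):
        sums[lab] += abs(y - y_hat)
        counts[lab] += 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


# Toy numbers only: observed rentals, model predictions, cluster labels.
observed = [120, 130, 300, 310, 50, 55]
predicted = [118, 133, 305, 306, 52, 54]
labels = [0, 0, 1, 1, 2, 2]

result = mean_abs_residual_by_cluster(observed, predicted, labels)
print(result)
```

If one cluster showed a much larger average error than the others, that group of weather conditions would be a candidate blind spot worth inspecting in the linked view.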

Example 2: The Particle Physics Puzzle (Physics)

  • The Setup: Physicists have a complex model with 150 knobs (parameters) that they turn to match experimental data about subatomic particles.
  • The Problem: With 150 knobs, it's impossible to know which ones actually matter.
  • The Solution: They took a smaller set of 6 knobs and 16 measurements. They grouped the measurements that looked similar. Then, they looked at the "knobs" for those groups. The tool revealed that only two specific knobs (out of the six) were responsible for creating the distinct groups. The other four knobs didn't seem to change the outcome much.
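
The article describes finding the important knobs visually. A rough numerical counterpart, offered here only as an assumed illustration, is to ask how much of each knob's variance is explained by cluster membership (the between-group share, sometimes called eta squared). The knob names and values below are invented.

```python
def between_group_share(values, labels):
    """Fraction of a variable's total variance explained by group membership."""
    n = len(values)
    grand_mean = sum(values) / n
    total = sum((v - grand_mean) ** 2 for v in values)
    if total == 0:
        return 0.0
    between = 0.0
    for group in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == group]
        group_mean = sum(members) / len(members)
        between += len(members) * (group_mean - grand_mean) ** 2
    return between / total


# Toy data: two clusters, two hypothetical knobs.
labels = [0, 0, 0, 1, 1, 1]
knobs = {
    "knob_a": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],  # differs strongly between clusters
    "knob_b": [0.5, 2.0, 1.2, 0.6, 1.9, 1.3],  # looks the same in both clusters
}

shares = {name: between_group_share(vals, labels) for name, vals in knobs.items()}
print(shares)
```

A share near 1 means the knob takes clearly different values in different clusters, while a share near 0 means the clusters look alike along that knob. This only captures one knob at a time, so it can miss effects that depend on combinations of knobs, which is where the interactive views help.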

Why This Matters

Before tools like pandemonium, figuring out these connections was like trying to find a needle in a haystack while wearing a blindfold. You might guess, but you couldn't see the pattern.

This package doesn't just crunch numbers; it lets you explore. It allows you to:

  1. Group data by similarity.
  2. Instantly see what those groups look like in the original data.
  3. Rotate and zoom through the data to find hidden structures.

It is designed to be easy enough for a beginner to use with a mouse and screen, but flexible enough for experts to plug in their own custom math formulas. It turns a confusing mess of high-dimensional data into a clear, interactive story.
