How libraries classified physics preprints before arXiv and set the stage for distinguishing insiders from outsiders

In this comment, historian and sociologist Phillip Roth examines the history of how libraries classified physics preprints prior to the establishment of arXiv, highlighting how these early systems helped distinguish between insiders and outsiders in the scientific community.

Original author: Phillip H. Roth

Published 2026-03-30

This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine the world of scientific research as a massive, chaotic library where new books (scientific papers) are being written every single day. Before the internet, if you wanted to know what was happening in a specific field like physics, you had to rely on a very slow, old-fashioned mail system.

This paper, written by Phillip Roth, tells the story of how a few clever librarians in the 1950s and 60s invented a system to organize this chaos. They didn't just sort books; they accidentally created a "VIP club" that separates the people who belong in the room (insiders) from everyone else (outsiders).

Here is the story broken down with some simple analogies:

1. The Problem: The "Lost in the Mail" Dilemma

Before the internet, scientists shared their latest discoveries by mailing copies of their unpublished papers (preprints) to each other. It was like a private group chat. But if you were a scientist visiting a big lab (like CERN in Switzerland) for a few weeks, you might miss the mail arriving at your home office. You'd be out of the loop, like someone who missed the group chat while on vacation.

The Librarian's Solution:
A librarian at CERN named Luisella realized this was a problem. Her message, in effect, was: "Stop mailing these notes to people's homes. Send them to the library instead!"

  • The Analogy: Imagine a town square where everyone used to whisper secrets in private backyards. The librarian built a central bulletin board in the town square and said, "Put your notes here, and everyone can read them." This turned private whispers into public news.

2. The Sorting System: The "Bouncer" at the Door

Soon, so many notes started arriving that the library was drowning in paper. They needed a way to sort them. They couldn't just pile them up; they needed categories.

  • The Human Filter: They hired "scientific information officers" (people who knew both physics and library science). These people acted like bouncers at a club. They looked at every paper and decided: "Does this fit our club's vibe?"
  • The Categories: They created simple labels like "Theoretical Physics" or "Machine Building."
  • The Catch: These labels weren't neutral. They were designed to fit what the CERN scientists cared about. If your paper was about something CERN didn't care about, it might get lost in the shuffle or labeled in a way that made it hard to find.

3. The "Badge of Membership"

Here is where things get interesting. Once the library's lists of newly received papers became widely known, scientists realized that getting a paper onto those lists was a status symbol.

  • The Analogy: Think of it like getting your song played on the radio. If you are on the playlist, you are a "real" artist. If you aren't, you are just playing in your garage.
  • Scientists started racing to get their papers sorted and listed quickly. Being "categorized" by the library meant you were an insider. If you weren't on the list, you were effectively an outsider.

4. The Hidden Bias: "Mainstream" vs. "Weird"

The paper argues that these sorting systems weren't fair.

  • The Keyword Game: At another lab (DESY in Germany), they used keywords to sort papers. If you wrote a paper on a popular topic, you got many keywords, making it easy to find. If you wrote about a weird, niche topic, you got few keywords, making you invisible.
  • The "General" Trap: Even today, on the famous preprint website arXiv, there is a category called "General Physics" (gen-ph). The author points out that this is actually a junk drawer. It's where papers go if they don't fit anywhere else. The system treats these papers as "uninteresting" to the experts. It's a digital way of saying, "This doesn't belong to our club."

5. The Modern Twist: Robots as Bouncers

Today, we use computers and AI to sort these papers instead of human librarians. The website arXiv uses algorithms to guess which category a paper belongs in.

  • The Illusion: The website says this is "empowering" and "transparent."
  • The Reality: The author argues that the robot is just following rules created by humans long ago. When the robot moves a paper to the "General" category, it is still enforcing the same old boundary: "You are not an insider."

The Big Takeaway

The paper concludes that the way we organize science isn't just about "finding information." It's a social tool.

  • The Metaphor: Classification systems are like fences. They look like helpful fences that keep the garden tidy, but they also decide who is allowed to walk through the gate and who is left standing outside.
  • The "correct" category isn't about scientific truth; it's about who holds the power to decide what is important. By sorting papers into "good" categories and "bad" categories, the system quietly decides who gets to be a famous physicist and who gets ignored.

In short: The librarians of the past built the first "VIP list" for scientists. Today, our computers keep that list going, deciding who is an "insider" and who is an "outsider" based on how well their work fits into the boxes we've built for them.
