Can AI be Easy? Lessons Learned from the EZR.py Toolkit

The paper argues that reading and refactoring code to create a minimal, unified Python toolkit (EZR.py) reveals that simple, lightweight algorithms can outperform complex state-of-the-art tools in tabular software engineering optimization tasks while requiring significantly less data and computational resources.

Original authors: Tim Menzies, Srinath Srinivasan

Published 2026-06-03. Author reviewed.

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written by the authors. For technical accuracy, refer to the original paper.

The Big Idea: Do We Really Need Giant AI Machines?

The current trend in Artificial Intelligence is like building a massive, high-tech skyscraper to solve a simple problem, such as finding a lost key in a garden. Everyone says, "You need a billion-dollar crane, a team of 50 engineers, and a supercomputer to find that key."

The authors of this paper say: "Wait a minute. You don't need a skyscraper. You just need a flashlight and a map."

⚠️ Important Scope Note:
This paper is not about all of Artificial Intelligence. It focuses specifically on one corner of the field: Tabular Software Engineering problems. This means tasks involving tables of numbers and specific goals, such as optimization, classification, prediction, regression, and basic text mining.

What this does NOT cover: It does not address Generative AI tasks (like ChatGPT or LLMs that generate new code, stories, or images). The authors have not tackled those generative tasks yet; applying these lessons to them is future work they hope to do. The claim here is that for tabular tasks, we are overcomplicating things.

For this large class of software engineering problems (those involving tables of numbers and goals), the authors built a tiny toolkit called EZR (only 400 lines of code). They report that it does the job of massive, heavy software libraries while running 500 times faster and needing very little labeled data to learn.

The Toolkit: A Swiss Army Knife vs. A Warehouse

Most modern AI tools are like a warehouse full of specialized tools: a giant saw for wood, a heavy drill for metal, a complex laser for glass. You have to buy the whole warehouse (installing huge libraries like pandas and sklearn) just to use one tool.

EZR is a Swiss Army Knife.
The authors realized that if you look closely at how these different tools work for tabular data, they are actually doing the same basic things. They stripped away the fancy packaging and found that:

  • Classification (sorting things into groups)
  • Clustering (finding natural groups)
  • Optimization (finding the best solution)
  • Text Mining (finding relevant documents)

...all rely on the same three simple building blocks:

  1. Num: A bucket that counts numbers and averages them.
  2. Sym: A bucket that counts symbols (like words or categories).
  3. Data: A box that holds rows of information.

Instead of building a new engine for every task, EZR uses these same buckets to do everything. It's like realizing that a spoon, a fork, and a knife are all just handles with a specific shape at the end; you don't need three different factories to make them.
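The three building blocks can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the method names (add, spread, mode), the "?" marker for missing values, and the rule that a column name starting with an uppercase letter is numeric are all assumptions made for this example.

```python
import math
from collections import Counter

class Num:
    """Bucket for numbers: running count, mean, and standard deviation."""
    def __init__(self):
        self.n, self.mu, self.m2 = 0, 0.0, 0.0

    def add(self, x):
        # Welford's incremental update: no need to store the values.
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n
        self.m2 += delta * (x - self.mu)

    def spread(self):
        # Sample standard deviation (0 when fewer than two values seen).
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

class Sym:
    """Bucket for symbols: running counts per distinct value."""
    def __init__(self):
        self.n, self.counts = 0, Counter()

    def add(self, x):
        self.n += 1
        self.counts[x] += 1

    def mode(self):
        return self.counts.most_common(1)[0][0]

    def spread(self):
        # Entropy in bits: 0 when every value is the same.
        return -sum(c / self.n * math.log2(c / self.n)
                    for c in self.counts.values())

class Data:
    """Box of rows, plus one summary bucket per column."""
    def __init__(self, names):
        self.names, self.rows = names, []
        # Assumed convention: uppercase first letter means numeric column.
        self.cols = [Num() if name[0].isupper() else Sym() for name in names]

    def add(self, row):
        self.rows.append(row)
        for col, x in zip(self.cols, row):
            if x != "?":  # assumed marker for a missing value
                col.add(x)

data = Data(["Age", "job"])
for row in [[30, "dev"], [40, "dev"], [50, "ops"]]:
    data.add(row)
```

Both buckets expose the same small interface (add and spread), which is what lets one piece of code treat numeric and symbolic columns uniformly.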

The Six Surprising Discoveries

The authors tested this tiny toolkit on 120+ real-world software problems. Here is what they found, explained with simple metaphors:

1. The "Heavy" Myth

The Belief: To do AI, you need a massive computer and huge libraries.
The Reality: For tabular tasks, you can do it with a tiny script.
Analogy: It's like thinking you need a full orchestra to play a lullaby. The authors showed that a single violin (EZR) can play the same tune just as well, without needing the 50 other musicians (the heavy dependencies).

2. The "Separate Subjects" Myth

The Belief: Sorting data, grouping data, and finding patterns are totally different subjects that need different code.
The Reality: For tabular data, they are nearly identical under the hood.
Analogy: It's like thinking driving a car, driving a truck, and driving a bus are completely different skills. The authors showed that once you strip away the size of the vehicle, the steering wheel and pedals are the same. They wrote 30 lines of code that handle all three tasks.

3. The "Tree" Myth

The Belief: Decision trees (like flowcharts for AI) for predicting numbers are totally different from those for predicting categories.
The Reality: They are the same tree; just the fruit is different.
Analogy: Imagine a tree that grows apples. If you want oranges, you don't need a new tree species; you just change the label on the branch. The authors showed that switching between predicting numbers and categories is a one-line change in the code.
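One way to see why the switch can be so small: a tree learner picks the split that leaves the least "spread" in the target on each side, and only the spread measure depends on the target type. The sketch below is an illustration under that assumption, not the paper's code; stdev, entropy, and best_cut are hypothetical names.

```python
import math
from collections import Counter

def stdev(ys):
    """Spread for numeric targets (regression)."""
    if len(ys) < 2:
        return 0.0
    mu = sum(ys) / len(ys)
    return math.sqrt(sum((y - mu) ** 2 for y in ys) / (len(ys) - 1))

def entropy(ys):
    """Spread for symbolic targets (classification)."""
    n = len(ys)
    return -sum(c / n * math.log2(c / n) for c in Counter(ys).values())

def best_cut(pairs, spread):
    """pairs: list of (x, y). Returns (cut, score); rows with x <= cut go left.

    The score is the size-weighted spread of y on the two sides of the cut.
    """
    pairs = sorted(pairs, key=lambda p: p[0])
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * spread(left) + len(right) * spread(right)) / len(pairs)
        if score < best[1]:
            best = (pairs[i - 1][0], score)
    return best

# The only difference between the two calls is the spread function passed in.
numeric_cut = best_cut([(1, 10.0), (2, 11.0), (3, 50.0), (4, 52.0)], stdev)
symbolic_cut = best_cut([(1, "a"), (2, "a"), (3, "b"), (4, "b")], entropy)
```

In this sketch, moving from predicting numbers to predicting categories means swapping stdev for entropy in the call; the search over candidate cuts is untouched.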

4. The "Old vs. New" Myth

The Belief: Newer, complex search methods (Local Search with restarts) are always better than old, simple ones (Simulated Annealing from 1983).
The Reality: For optimization tasks, the old method is often just as good, or better.
Analogy: Imagine trying to find the lowest point in a foggy valley. The "new" method says, "If you get stuck, jump back to the start and try again!" The "old" method says, "If you get stuck, take a small, random step up to shake yourself loose." The authors found that the "shake loose" method (1983) worked just as well as the "jump back" method, but without the chaos of constantly restarting.
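The "shake loose" idea can be sketched as a generic simulated annealing loop. This is a textbook-style illustration, not the paper's implementation; the linear cooling schedule, the step size, the budget, and the function name are all assumptions.

```python
import math
import random

def simulated_annealing(f, x, step=1.0, budget=2000, seed=1):
    """Minimize f over one real variable, starting from x.

    Worse moves are sometimes accepted; the chance shrinks as the
    temperature cools, so early on the search can climb out of dips.
    """
    rng = random.Random(seed)
    cur_x, cur_y = x, f(x)
    best_x, best_y = cur_x, cur_y
    for i in range(1, budget + 1):
        temp = 1 - i / (budget + 1)  # cools from near 1 toward 0, never 0
        new_x = cur_x + rng.uniform(-step, step)  # small random step
        new_y = f(new_x)
        # Always accept improvements; accept worse moves with
        # probability exp(-(how much worse) / temperature).
        if new_y < cur_y or rng.random() < math.exp((cur_y - new_y) / temp):
            cur_x, cur_y = new_x, new_y
        if cur_y < best_y:
            best_x, best_y = cur_x, cur_y
    return best_x, best_y

best_x, best_y = simulated_annealing(lambda x: (x - 3) ** 2, x=10.0)
```

A local search with restarts would replace the probabilistic acceptance with "accept only improvements, and jump to a fresh random start when stuck". The loop above never restarts; it relies on the occasional uphill step instead.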

5. The "More Data" Myth

The Belief: You need thousands of labeled examples and thousands of features (variables) to build a good model.
The Reality: For tabular problems, you need very few labels and very few features.
Analogy: Imagine trying to guess the winner of a race. You might think you need to know the runner's height, weight, shoe size, diet, sleep schedule, and blood type (thousands of features). The authors found that knowing just two or three things (like "shoe size" and "sleep") was enough to predict the winner accurately. They also found that labeling just 50 examples was enough to train a model that usually requires thousands.

6. The "Text Mining" Myth

The Belief: To find relevant documents in a huge library, you need massive AI models (LLMs) with billions of parameters.
The Reality: For simple document retrieval, a simple math trick works better.
Analogy: Imagine looking for a specific needle in a haystack. The high-tech approach uses a giant magnet that weighs a ton. The authors used a simple "Complementary Bayes" trick (30 lines of code) that acts like a sharp needle. It found the relevant documents faster and with fewer mistakes than the giant magnet, and it exposed a flaw in how the giant magnet was being used.
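The article does not show the authors' code. Assuming "Complementary Bayes" refers to the Complement Naive Bayes idea (score each class by how poorly a document matches everything outside that class), a minimal sketch looks like this. The function names, the smoothing constant alpha, and the toy documents are hypothetical.

```python
import math
from collections import Counter

def train_cnb(docs, alpha=1.0):
    """docs: list of (list_of_words, label). Returns {label: {word: weight}}.

    For each label, word weights come from the word counts of all
    documents that do NOT carry that label (the "complement").
    """
    vocab = {w for words, _ in docs for w in words}
    labels = sorted({label for _, label in docs})
    weights = {}
    for label in labels:
        comp = Counter()
        for words, other in docs:
            if other != label:
                comp.update(words)
        total = sum(comp.values()) + alpha * len(vocab)
        # Smoothed log-probability of each word in the complement.
        weights[label] = {w: math.log((comp[w] + alpha) / total) for w in vocab}
    return weights

def classify_cnb(weights, words):
    """Pick the label whose complement explains the words worst."""
    def score(label):
        # Words never seen in training contribute nothing to any label.
        return sum(weights[label].get(w, 0.0) for w in words)
    return min(weights, key=score)

docs = [
    (["crash", "null", "pointer"], "bug"),
    (["crash", "on", "start"], "bug"),
    (["add", "dark", "mode"], "feature"),
    (["add", "export", "button"], "feature"),
]
model = train_cnb(docs)
label = classify_cnb(model, ["crash", "pointer"])
```

The whole model is a table of word weights per label, which is why this kind of classifier fits in a few dozen lines and trains in a single pass over the documents.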

The "Active Learning" Superpower

One of the coolest things EZR does is Active Learning.

  • Passive Learning: Imagine a student who reads 1,000 pages of a textbook to learn a concept.
  • Active Learning (EZR): Imagine a student who reads 10 pages, realizes what they don't understand, and asks the teacher only about those specific gaps.

EZR acts like that smart student. It looks at the data, figures out which few examples are the most confusing or important, and asks for labels on only those. This saves massive amounts of time and money because humans don't have to label thousands of boring, repetitive examples.
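The loop described above can be sketched with a generic uncertainty-sampling strategy. This is not EZR's actual selection rule, which the article does not spell out; the one-dimensional pool, the boolean oracle, the label budget, and the function names are assumptions chosen to keep the example small.

```python
def nearest(x, labeled, want):
    """Distance from x to the closest labeled point whose label is `want`."""
    return min((abs(x - lx) for lx, ly in labeled if ly == want),
               default=float("inf"))

def active_learn(pool, oracle, budget=12):
    """Label only the most ambiguous points. oracle(x) returns True or False.

    Starts from the two extreme points, then repeatedly asks for the
    label of the point that sits most evenly between the known True
    examples and the known False examples.
    """
    pool = sorted(pool)
    labeled = [(pool[0], oracle(pool[0])), (pool[-1], oracle(pool[-1]))]
    todo = pool[1:-1]
    for _ in range(budget):
        if not todo:
            break
        pick = min(todo, key=lambda x: abs(nearest(x, labeled, True)
                                           - nearest(x, labeled, False)))
        todo.remove(pick)
        labeled.append((pick, oracle(pick)))  # the only costly step
    return labeled

def predict(x, labeled):
    """Guess a label by copying the closest labeled point."""
    return min(labeled, key=lambda p: abs(x - p[0]))[1]

pool = [i / 100 for i in range(101)]
labeled = active_learn(pool, oracle=lambda x: x > 0.6)
```

Here the learner spends 14 labels (2 to start plus a budget of 12) on a pool of 101 points, concentrating its questions around the boundary where the answer changes rather than labeling everything.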

The Conclusion: Read the Code, Don't Just Trust the Hype

The paper's main message is a call to action for developers and researchers: Read the code.

The authors argue that we have stopped reading code and started blindly trusting "black box" AI tools. By actually reading the code of these tools, they realized that many of them are doing the same thing in different ways.

The Takeaway:
Before you buy a Ferrari to drive to the grocery store, try walking.

  • If you can solve your problem with a tiny, simple toolkit (like EZR) for tabular software engineering tasks, you save time, money, and energy.
  • If the simple toolkit doesn't work, then you know you genuinely need a complex solution.
  • But if you just assume you need the complex solution because "everyone else is doing it," you might be carrying a heavy backpack when you only needed a pocket knife.

The authors conclude that in the world of software engineering optimization, less is often more, and the best way to find the "less" is to carefully read and simplify the code we already have.

Final Note on Scope: These lessons are demonstrated specifically for tabular SE tasks. Whether these simple methods extend to the complex world of Generative AI (LLMs) is an open question and a goal for future work. The authors are not claiming to have solved all of AI, but rather that we have overcomplicated a very large and important slice of it.
