Texo: Formula Recognition within 20M Parameters

Imagine a huge library of mathematical formulas that exist only as pictures. For a long time, the only way to read them (turning a picture of an equation into text a computer can edit) was to hire a massive, expensive team of experts. These "experts" were large computer models that needed powerful hardware to run well. They were like a 10-ton truck delivering a single letter: they could do the job, but they were slow, expensive, and hard to fit in a regular garage (your home computer).

This paper introduces Texo, a new model that changes the game. Think of Texo not as a 10-ton truck, but as a sleek, high-speed electric scooter. It's tiny, incredibly efficient, and can zip through the same job just as well as the giant truck, but it fits right in your pocket.

Here is the story of how they built this "scooter" using three clever tricks:

1. The Problem: The "Over-Engineered" Dictionary

The old models (like the giant trucks) were built with a dictionary meant for the entire English language. They knew words like "elephant," "butterfly," and "democracy." But when you are reading math, you don't need to know about butterflies! You only need to know about symbols like +, ∫, √, and π.

Because these old models carried around a dictionary of 50,000 words, they were bloated and heavy. It was like trying to carry a whole encyclopedia in your backpack just to read a single math textbook.

The Texo Solution:
The authors realized that math has its own strict language. They created a specialized, mini-dictionary containing only the 687 most important math symbols.

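The Analogy: Instead of carrying a library, Texo carries a pocket cheat sheet. Throwing out the other 49,313 entries (50,000 minus 687) removes most of the weight tied to the vocabulary, and it helps make Texo about 80% smaller than UniMERNet-T (20M versus 107M parameters).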
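
To put rough numbers on the cheat-sheet idea, here is a minimal sketch. In a typical text decoder, the input embedding table and the output projection each hold vocab_size x hidden_dim weights. The hidden size of 384 and the untied-weights default are assumptions made for illustration, not figures from the paper; only the vocabulary sizes of 50,000 and 687 come from the text above.

```python
def vocab_params(vocab_size: int, hidden_dim: int, tied: bool = False) -> int:
    """Weights in the input embedding table plus the output projection.

    If the two matrices share weights (tied=True), count them once.
    Bias terms are ignored to keep the estimate simple.
    """
    per_matrix = vocab_size * hidden_dim
    return per_matrix if tied else 2 * per_matrix


HIDDEN_DIM = 384  # ASSUMPTION: illustrative hidden size, not taken from the paper

old_vocab_params = vocab_params(50_000, HIDDEN_DIM)
new_vocab_params = vocab_params(687, HIDDEN_DIM)
removed_tokens = 50_000 - 687

print(f"tokens removed:           {removed_tokens:,}")
print(f"vocab weights (50k):      {old_vocab_params:,}")
print(f"vocab weights (687):      {new_vocab_params:,}")
print(f"weights no longer needed: {old_vocab_params - new_vocab_params:,}")
```

Under these assumed dimensions the vocabulary-related weights shrink from tens of millions to well under one million. The real saving depends on the actual hidden size and on whether the model ties its embedding and output weights.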

2. The Transfer: "Stealing" the Brain, Not the Body

Training a smart AI from scratch takes a long time and needs a lot of data. Texo didn't start from zero. The team took a smaller, pre-trained model (called PPFormulaNet-S) and performed a "brain transplant."

They kept the smart parts that knew how to look at an image and understand shapes, but they swapped out the heavy, clunky vocabulary for their new, tiny math dictionary.

The Analogy: Imagine taking a professional race car engine (the smart brain) and putting it into a lightweight, aerodynamic chassis (the new vocabulary). The engine knows how to drive fast, but now the whole car is light enough to fly.
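
One common way to do this kind of vocabulary swap is to copy the embedding rows of the tokens that survive into a new, smaller table. The sketch below shows that generic row-copy idea on toy data. It is an assumption about how such a transfer can work, not a description of the authors' exact procedure, and the function name, toy tokens, and vector values are all hypothetical.

```python
def transfer_embeddings(old_vocab, old_matrix, new_tokens, hidden_dim):
    """Build a smaller embedding table by reusing rows from a larger one.

    old_vocab  : dict mapping token -> row index in old_matrix
    old_matrix : list of rows, each a list of `hidden_dim` floats
    new_tokens : tokens of the new vocabulary, in their new id order
    Returns (new_matrix, missing), where `missing` lists tokens that had
    no row in the old table and were zero-initialised instead.
    """
    new_matrix, missing = [], []
    for token in new_tokens:
        old_id = old_vocab.get(token)
        if old_id is None:
            new_matrix.append([0.0] * hidden_dim)  # placeholder, to be learned later
            missing.append(token)
        else:
            new_matrix.append(list(old_matrix[old_id]))  # copy the trained row
    return new_matrix, missing


# Toy data (hypothetical): a 5-token "old" vocabulary with 2-dimensional vectors.
old_vocab = {"elephant": 0, "+": 1, "\\int": 2, "democracy": 3, "\\pi": 4}
old_matrix = [[float(i), float(i) + 0.5] for i in range(5)]
new_tokens = ["+", "\\int", "\\pi", "\\sqrt"]

new_matrix, missing = transfer_embeddings(old_vocab, old_matrix, new_tokens, hidden_dim=2)
print(new_matrix)
print("not found in old vocabulary:", missing)
```

The rows for "+", "\int", and "\pi" keep what the old model already learned, while "elephant" and "democracy" are simply left behind. Tokens with no old row, like "\sqrt" here, would need to be learned during fine-tuning.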

3. The Result: Math in Your Browser

The final result is a model with only 20 million parameters (a tiny number in the AI world).

Speed: It is about 7 times faster than UniMERNet-T, the previous best open-source model.
Accessibility: Because it is so small, you don't need a supercomputer. You can run it right in your web browser.
Privacy: The authors built a website where you can upload a photo of a formula, and the math is processed entirely on your own computer. No data is sent to a server. It's like doing your homework in your own living room rather than sending it to a school office where someone else might read it.
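
To get a feel for why 20 million parameters is browser-friendly, the sketch below estimates how much space the weights alone would take at a few common numeric precisions. Which precision the browser demo actually uses is not stated here, so the three options are assumptions for illustration.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weights_megabytes(num_params: int, dtype: str) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


TEXO_PARAMS = 20_000_000  # from the text: about 20 million parameters

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weights_megabytes(TEXO_PARAMS, dtype):.0f} MB")
```

These figures cover only the stored weights. Runtime memory for activations and the browser's own overhead come on top.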

Why Does This Matter?

In the past, if you wanted to turn a picture of a complex equation into editable text, you had to pay for expensive software or use slow, heavy tools.

Texo proves that you don't need a "giant" to do a "giant" job. By being smart about what the model learns (focusing only on math) and how it speaks (using a tiny, efficient dictionary), they achieved the same high quality as the giants, but with a fraction of the resources.

In short: Texo is the "Swiss Army Knife" of formula recognition. It's small, fits in your pocket, does everything the big tools do, and you can use it for free, right now, without worrying about your privacy.

Model	Parameters	CDM Score (Avg)	Inference Speed (vs. UniMERNet-T)
UniMERNet-T	107M	High (SOTA)	Baseline (1x)
PPFormulaNet-S	58M	Moderate	~3x faster (due to parallel decoding)
Texo	20M	Comparable/Higher	~7x faster than UniMERNet-T
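
The parameter counts in the table translate into size reductions with one line of arithmetic. The short sketch below does that calculation, using only the parameter counts listed above; the function and variable names are made up for this example.

```python
# Parameter counts in millions, as listed in the table.
PARAMS_M = {"UniMERNet-T": 107, "PPFormulaNet-S": 58, "Texo": 20}


def size_reduction(baseline_m: float, model_m: float) -> float:
    """Fraction by which `model_m` is smaller than `baseline_m`."""
    return 1 - model_m / baseline_m


for name in ("UniMERNet-T", "PPFormulaNet-S"):
    pct = size_reduction(PARAMS_M[name], PARAMS_M["Texo"]) * 100
    print(f"Texo vs {name}: {pct:.1f}% fewer parameters")
```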
