EngGPT2: Sovereign, Efficient and Open Intelligence

EngGPT2-16B-A3B is a sovereign, open-weight, and efficient Mixture-of-Experts large language model trained from scratch on a corpus with a significant share of Italian data. It delivers high-performance, EU AI Act-compliant reasoning in both Italian and English while requiring substantially less training data and inference compute than comparable dense models.

G. Ciarfaglia, A. Rosanova, S. Cipolla, J. Bartoli, A. Di Domenico, C. Fioroni, A. Fontana, M. R. Scoleri, M. I. Mone, D. Franchi, M. C. Del Gaudio, F. Picariello, M. Gabusi, S. Bonura, V. Morreale, et al.
Published 2026-03-18

Imagine the world of Artificial Intelligence as a massive, high-stakes cooking competition. For years, the biggest chefs (like the US and China) have been building "Super-Kitchens" with unlimited ingredients and giant ovens to create the most complex dishes (AI models). They use trillions of ingredients (data tokens) and massive amounts of energy.

EngGPT2 is the story of a new, clever Italian team (Engineering Group) that decided to win not by having the biggest kitchen, but by being the most efficient, sovereign, and smart chef in the room.

Here is the story of EngGPT2, broken down into simple concepts:

1. The "Smart Kitchen" Design (The Architecture)

Most AI models are like a kitchen where every single chef tries to cook every single dish. If you have 16 billion chefs, that's a lot of work, even for a simple salad.

EngGPT2 uses a Mixture-of-Experts (MoE) design. Imagine a kitchen with 64 specialized chefs (experts), but for every dish you order, the head chef only wakes up 8 of them to do the work.

  • The Analogy: Instead of 16 billion chefs working on a sandwich, you only use 3 billion "active" chefs. The other 13 billion are resting, saving energy and money.
  • The Result: You get a delicious, high-quality meal (smart answers) using only a fraction of the electricity and time.
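
For the technically curious, here is a tiny sketch of how that "wake up 8 of 64 chefs" routing works inside a Mixture-of-Experts layer. The expert count (64) and the active count (8) come from the description above; everything else (the sizes, the toy "experts", the plain NumPy code) is invented for illustration, and the real EngGPT2 layer is considerably more sophisticated.

```python
# Toy Mixture-of-Experts routing: only k of n experts run for each token.
# 64 experts / 8 active mirror the text above; sizes and weights are made up.
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, TOP_K, D_MODEL = 64, 8, 512

# Each "expert" is just a random matrix standing in for a small feed-forward net.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02   # the "head chef"

def moe_layer(token_vec):
    """Route one token vector through its top-k experts only."""
    scores = token_vec @ router                   # one affinity score per expert
    top = np.argsort(scores)[-TOP_K:]             # pick the 8 best-matching experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                      # softmax over the chosen experts only
    # Only these 8 experts do any work; the other 56 are never evaluated.
    return sum(w * (token_vec @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.standard_normal(D_MODEL))
print(out.shape)                                  # (512,)
print(f"experts used per token: {TOP_K}/{N_EXPERTS}")
```

That skip-most-of-the-work routing is exactly why only about 3 of the 16 billion parameters are "awake" for any given token.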

2. The Secret Recipe (The Data)

Big models usually eat everything: the internet, books, code, and random noise. They consume about 15 to 36 trillion "bites" (tokens) of data.

  • EngGPT2's Diet: They ate a smaller, higher-quality meal of 2.5 trillion tokens.
  • The Special Ingredient: About 25% of their diet was Italian. While other models are mostly English speakers who know a little Italian, EngGPT2 was raised speaking Italian from day one. This makes it a "local expert" for European culture, laws, and language, ensuring it understands the nuances of the region it serves.
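
To put those numbers side by side, here is a quick back-of-the-envelope calculation. The 2.5 trillion total and the roughly 25% Italian share come from the summary above; the 15 to 36 trillion figures are the ballpark quoted for other large models.

```python
# Back-of-the-envelope view of the training "diet" described above.
ENGGPT2_TOKENS = 2.5e12              # 2.5 trillion tokens in total
ITALIAN_SHARE = 0.25                 # roughly a quarter of the corpus is Italian
BIG_MODEL_RANGE = (15e12, 36e12)     # 15-36 trillion tokens for comparison

italian_tokens = ENGGPT2_TOKENS * ITALIAN_SHARE
print(f"Italian tokens: ~{italian_tokens / 1e12:.2f} trillion")                # ~0.62 trillion
print(f"Share of a 15T-token run: {ENGGPT2_TOKENS / BIG_MODEL_RANGE[0]:.0%}")  # ~17%
print(f"Share of a 36T-token run: {ENGGPT2_TOKENS / BIG_MODEL_RANGE[1]:.0%}")  # ~7%
```

In other words, a much smaller but deliberately curated meal, with an unusually large Italian portion.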

3. The Training Camp (The Process)

Training an AI is like sending a student through school. EngGPT2 went through four distinct stages:

  1. Elementary (Pre-training): Learning to read and write in English and Italian by reading books, websites, and code.
  2. High School (Long-Context): Learning to read a whole novel without forgetting the first page. It was trained to handle very long documents (a context window of about 32,000 tokens) so it can summarize a whole contract or a long report.
  3. University (Mid-Training): Learning how to think logically. It practiced math, puzzles, and step-by-step reasoning to become a "thinker," not just a "talker."
  4. Graduate School (Post-Training): Learning how to be polite, follow instructions, and act like a helpful assistant. It learned to say "I don't know" when appropriate and to follow safety rules (compliance with EU laws).
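
If you picture those four stages as a curriculum, a configuration for it might look like the sketch below. The field names, goal descriptions, and the shorter early context length are illustrative placeholders; only the ordering of the stages and the roughly 32,000-token long-context target come from the description above.

```python
# Hypothetical curriculum for the four training stages described above.
# Only the stage order and the ~32,000-token long-context target come from
# the summary; context lengths for earlier stages are illustrative guesses.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    goal: str
    context_length: int   # how much text the model sees at once

CURRICULUM = [
    Stage("pre-training",  "learn English and Italian from books, web pages, code", 4_096),
    Stage("long-context",  "extend memory to whole contracts and long reports",     32_000),
    Stage("mid-training",  "practice math, puzzles, step-by-step reasoning",        32_000),
    Stage("post-training", "follow instructions, stay safe and EU-compliant",       32_000),
]

for i, stage in enumerate(CURRICULUM, start=1):
    print(f"Stage {i}: {stage.name:13} | context {stage.context_length:>6,} | {stage.goal}")
```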

4. The "Turbo" Mode (Reasoning Styles)

One of the coolest features is how EngGPT2 "thinks out loud."

  • Standard Mode: It gives you a direct answer.
  • Full Reasoning Mode: It shows its work. It writes down every step of its logic (like showing your work in math class) in English or Italian, so you can see why it reached its answer and decide whether to trust it.
  • Turbo Mode: This is the "speed dial." It compresses the thinking process into bullet points. It's like a chef saying, "I chopped, sautéed, and seasoned" instead of writing a 20-page recipe. It's much faster and cheaper to run, perfect for real-time apps.
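
There is no single standard way to switch between such modes, and the sketch below is purely hypothetical: it shows one plausible pattern, steering the style through a system prompt. The prompt wording and the helper function are invented for illustration and are not EngGPT2's actual interface; only the three mode names echo the list above.

```python
# Hypothetical helper for picking a reasoning style via the system prompt.
# Mode names mirror the text above; prompt wording is illustrative, not EngGPT2's real API.
SYSTEM_PROMPTS = {
    "standard": "Answer directly and concisely.",
    "full":     "Think step by step and show every step of your reasoning, "
                "in English or Italian, before giving the final answer.",
    "turbo":    "Reason internally, but output only a short bulleted summary "
                "of the key steps before the final answer.",
}

def build_messages(question, mode="standard"):
    """Assemble a chat request asking for the chosen reasoning style."""
    if mode not in SYSTEM_PROMPTS:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[mode]},
        {"role": "user", "content": question},
    ]

# Turbo mode is cheaper because far fewer "thinking" tokens are generated.
print(build_messages("Qual è la capitale d'Italia?", mode="turbo"))
```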

5. Why It Matters (Sovereignty & Efficiency)

  • Sovereignty: Europe wants its own AI that follows European rules (the EU AI Act) and values, rather than relying on models built in other countries. EngGPT2 is built from scratch in Europe, with full transparency.
  • Efficiency: Because it uses the "8 chefs out of 64" trick, it costs roughly one-fifth to one-half the energy to run compared to other models of similar smarts. It's like driving a hybrid car that covers the same distance as a gas-guzzler on a fraction of the fuel.
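
The rough arithmetic behind that efficiency claim is simple: per generated token, a Mixture-of-Experts model only pays for its active parameters (about 3 billion here). The dense model sizes below are illustrative comparison points, not measurements from the paper.

```python
# Rough per-token compute comparison: a MoE model only pays for *active* parameters.
# The dense comparison sizes are illustrative, not figures reported in the paper.
ACTIVE_PARAMS = 3e9        # EngGPT2's ~3 billion active parameters per token
DENSE_MODELS = {"8B dense": 8e9, "16B dense": 16e9, "30B dense": 30e9}

for name, params in DENSE_MODELS.items():
    # FLOPs per token scale roughly linearly with the parameters actually used.
    print(f"vs {name}: ~{ACTIVE_PARAMS / params:.0%} of the per-token compute")
```

This back-of-the-envelope ratio is consistent with the "one-fifth to one-half" range quoted above; the exact saving depends on which model you compare against.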

The Verdict

EngGPT2 is a 16-billion-parameter model that punches way above its weight class.

  • It beats other models of its size (8B–16B) in math, logic, and coding.
  • It competes with much larger, more expensive models (30B+) but costs a fraction to train and run.
  • It is the first open model that is truly "European," speaking Italian fluently and respecting local laws.

In short: EngGPT2 proves you don't need to be the biggest, loudest, or most expensive AI to be the smartest. With the right architecture, a focused diet, and a little bit of Italian flair, you can build a model that is powerful, efficient, and ready for the future.
