Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access

This paper addresses the fragmentation of pre-computed Geospatial Foundation Model embeddings by proposing a three-layer taxonomy and extending TorchGeo with a unified API to standardize access, thereby enabling interoperable, reproducible, and accessible Earth observation workflows.

Heng Fang, Adam J. Stewart, Isaac Corley, Xiao Xiang Zhu, Hossein Azizpour

Published 2026-02-25

Imagine the Earth as a giant, incredibly complex library. For years, scientists have been trying to read every book in this library to understand our planet's climate, forests, and cities.

Recently, a new technology called Geospatial Foundation Models (GFMs) arrived. Think of these models as super-intelligent librarians who can read a book and instantly summarize its entire story into a single, short "ID card" (a vector of numbers). These ID cards are called Earth Embeddings.

However, there's a huge problem: The library is a mess.

The Problem: A Library with No Catalog

Right now, the ecosystem of these "Earth ID cards" is chaotic.

  • Different Formats: Some librarians write their ID cards on sticky notes, others on index cards, and some on digital tablets that only work on specific computers.
  • Different Resolutions: Some cards describe the whole city, while others describe just a single street corner.
  • Hard to Use: To use a card, you often have to build your own special machine to read it. And if you want to compare the "City Card" from Librarian A with the "Street Card" from Librarian B, you can't do it easily because they speak different languages.

This makes it very hard for regular people (practitioners) to use these powerful tools. It's like trying to build a house when every brick comes from a different factory with a different shape and no instructions on how to stack them.

The Solution: A Universal Translator

The authors of this paper, a team of researchers from Sweden, Germany, and the US, decided to fix this mess. They did three main things:

1. They Drew a Map (The Taxonomy)

They organized the chaotic library into a neat three-story building:

  • The Data Floor (The Cards): They sorted the ID cards by how detailed they are.
    • Location Cards: Just tell you "Where" (like a GPS coordinate).
    • Patch Cards: Summarize a whole photo (like a "vibe check" of a neighborhood).
    • Pixel Cards: Describe every single dot in an image (like a high-definition map of every tree).
  • The Tool Floor (The Readers): They looked at the tools people use to test these cards, like scoreboards and competitions.
  • The Value Floor (The Uses): They looked at what people actually do with these cards, like finding similar forests or mapping poverty.
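To make the three kinds of cards on the Data Floor more concrete, here is a minimal sketch of how much data each one holds. The function name, the 64-number card length, and the 256 x 256 image size are illustrative assumptions, not values taken from the paper.

```python
def embedding_shape(kind, dim, height=1, width=1):
    """Return the array shape of one embedding "card" (hypothetical helper).

    kind   -- "location", "patch", or "pixel"
    dim    -- how many numbers are in each embedding vector
    height -- image height in pixels (only used for pixel cards)
    width  -- image width in pixels (only used for pixel cards)
    """
    if dim <= 0 or height <= 0 or width <= 0:
        raise ValueError("dim, height, and width must be positive")
    if kind == "location":
        # One vector for a single (longitude, latitude) coordinate.
        return (dim,)
    if kind == "patch":
        # One vector summarizing an entire image patch.
        return (dim,)
    if kind == "pixel":
        # One vector for every pixel, so the result is itself image-shaped.
        return (dim, height, width)
    raise ValueError(f"unknown embedding kind: {kind!r}")


location_card = embedding_shape("location", dim=64)
patch_card = embedding_shape("patch", dim=64)
pixel_card = embedding_shape("pixel", dim=64, height=256, width=256)
```

Location and patch cards have the same shape here; what differs is the input they describe (a coordinate versus a whole image). Pixel cards are far larger, which is one reason file format and storage matter so much for them.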

2. They Built a Universal Adapter (TorchGeo)

This is the most important part. The team updated a popular software library called TorchGeo.

  • Before: If you wanted to use a specific Earth ID card, you had to download 4 different software packages, write 100 lines of code, and hope your computer didn't crash.
  • After: With their new update, you can load any of these different ID cards using just one simple command, like plugging a USB drive into a computer. It doesn't matter if the card came from Google, Clay, or a university; the software treats them all the same way.
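The "universal adapter" idea can be sketched in a few lines of plain Python. This is not the TorchGeo API: the registry, the product names, and the `load_embedding` function below are all hypothetical stand-ins, and the readers return made-up numbers instead of reading real files. The point is only the pattern, where each product keeps its own reader but the caller sees one function.

```python
from typing import Callable, Dict, List

# Hypothetical registry: product name -> function that reads that product.
READERS: Dict[str, Callable[[float, float], List[float]]] = {}


def register(name: str):
    """Decorator that files a reader function under a product name."""
    def decorator(fn):
        READERS[name] = fn
        return fn
    return decorator


@register("product_a")
def _read_product_a(lon: float, lat: float) -> List[float]:
    # Stand-in for a product stored as raster tiles (fake values).
    return [round(lon, 1), round(lat, 1), 0.0]


@register("product_b")
def _read_product_b(lon: float, lat: float) -> List[float]:
    # Stand-in for a product stored as a table of rows (fake values).
    return [lat / 90.0, lon / 180.0]


def load_embedding(product: str, lon: float, lat: float) -> List[float]:
    """Single entry point: same call, whichever product is requested."""
    try:
        reader = READERS[product]
    except KeyError:
        raise ValueError(f"unknown product: {product!r}") from None
    return reader(lon, lat)


card_a = load_embedding("product_a", 18.07, 59.33)
card_b = load_embedding("product_b", 90.0, 45.0)
```

Adding a new product in this sketch means writing one more reader and registering it; code that calls `load_embedding` does not change.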

They turned these complex data products into "first-class citizens," meaning they are now as easy to use as a standard photo or a spreadsheet.

3. They Wrote a User Manual (The Survey)

They compiled two large survey tables (Tables I and II in the paper) that act like a restaurant menu. For each embedding product, they tell you:

  • What ingredients (data) were used to make the dish?
  • Who owns the recipe (license)?
  • Can you cook it yourself (reproducibility)?
  • How big is the portion (resolution)?

Why This Matters

Imagine you are a farmer trying to predict crop yields.

  • Without this paper: You spend months trying to figure out how to download the data, fix the file formats, and write code just to get the data to load. You might give up.
  • With this paper: You open your software, load the Earth Embeddings you need with one simple command, and can quickly look for similar fields from around the world. You can start solving your actual problem right away.
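Finding "similar fields" usually comes down to comparing ID cards with a similarity score. The sketch below uses cosine similarity, a common choice for comparing embedding vectors; the paper summary does not say which measure any particular product uses, and the field names and three-number cards are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def most_similar(query, candidates, k=1):
    """Return the names of the k candidates whose cards best match the query."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up embedding cards for three fields.
fields = {
    "field_a": [1.0, 0.0, 0.0],
    "field_b": [0.7, 0.7, 0.0],
    "field_c": [0.0, 1.0, 0.0],
}
my_field = [1.0, 0.05, 0.0]

closest = most_similar(my_field, fields, k=2)  # ['field_a', 'field_b']
```

Real embeddings have many more numbers per card and far more candidates, so practical systems typically use a dedicated vector index rather than sorting every candidate.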

The Future

The authors also lay out a "To-Do List" for the future:

  • Look at the Oceans: Most cards only look at land; we need cards for the sea and sky too.
  • Be Clearer: We need to know why the computer made a certain ID card (explainability).
  • Better File Types: Stop using messy, old file formats; use modern "cloud-native" formats that work better on the internet.

In short: This paper takes a scattered, confusing pile of high-tech Earth data and organizes it into a clean, accessible, and easy-to-use system, allowing anyone to finally unlock the secrets hidden in our planet's data.
