COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data -- Generation Stochastic by Design

COP-GEN is a multimodal latent diffusion transformer designed for Earth observation that addresses the inherent non-injectivity of cross-sensor relationships by modeling conditional distributions to generate diverse, physically consistent, and uncertainty-aware realizations across optical, radar, and elevation modalities without task-specific retraining.

Miguel Espinosa, Eva Gmelich Meijling, Valerio Marsocci, Elliot J. Crowley, Mikolaj Czerkawski

Published 2026-03-04

Imagine you are trying to describe a specific place on Earth to a friend who has never seen it. You tell them, "It's a flat, green field with a small hill in the middle."

If you asked a traditional computer program to draw this, it would likely give you one specific image: a perfectly flat green field with a perfectly round hill. It would be a "safe" guess, the average of all possible fields. But in reality, that description could be a sunny wheat field in France, a rainy pasture in Ireland, or a snowy meadow in Canada. The description is the same, but the reality is different.

This is the problem COP-GEN solves.

The Problem: The "One Answer" Trap

Earth observation (satellite data) is messy. We have optical cameras (like the one in your phone), radar (which sees through clouds), elevation maps (topography), and land-cover maps.

The relationship between these is one-to-many.

  • Input: "A forest on a mountain."
  • Output: It could be a sunny summer forest, a foggy winter forest, a forest with snow, or a forest with a storm.

Older AI models act like a strict librarian who only hands you the "average" book. Ask for a forest, and you get a blurry picture that looks like every forest and no forest at all. They collapse all the possibilities into one safe, bland answer.
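To see why "one safe answer" goes wrong, here is a minimal sketch in Python. Everything in it is a made-up toy, not data or code from the paper: the function `sample_scene_brightness` and its numbers are assumptions chosen only to mimic a description that matches two very different scenes. The single guess with the lowest average squared error is the mean of all outcomes, and the sketch counts how many real outcomes actually look like that mean.

```python
import random
import statistics

def sample_scene_brightness(rng):
    """Return one plausible brightness for the same input description.

    Hypothetical toy: half the time the scene is snowy (bright, around 0.9),
    half the time it is a dark summer forest (around 0.2).
    """
    if rng.random() < 0.5:
        return rng.gauss(0.9, 0.03)
    return rng.gauss(0.2, 0.03)

rng = random.Random(0)
outcomes = [sample_scene_brightness(rng) for _ in range(10_000)]

# The single prediction that minimizes mean squared error is the mean.
point_estimate = statistics.fmean(outcomes)

# Count the real outcomes that look anything like that "average" answer.
near_average = sum(abs(y - point_estimate) < 0.1 for y in outcomes)

print(f"average prediction: {point_estimate:.2f}")
print(f"outcomes within 0.1 of it: {near_average} of {len(outcomes)}")
```

The average lands in the gap between the two kinds of scene, a brightness that neither the snowy nor the summer version ever has. That is the blurry "every forest and no forest" picture in miniature.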

The Solution: COP-GEN (The "Imaginative Artist")

COP-GEN is a new AI model designed by researchers at the University of Edinburgh and the European Space Agency. Instead of trying to guess the one right answer, it learns the entire range of possibilities.

Think of COP-GEN not as a calculator, but as a creative artist who understands the rules of physics.

  • If you show it a map of a mountain and a forest, it doesn't just draw one picture.
  • It says, "Ah, I can paint a sunny version, a foggy version, or a stormy version. All of these are physically possible."
  • It generates multiple, diverse, and realistic versions of the same scene.

How It Works: The "Universal Translator"

The world of satellite data is like a group of people speaking different languages:

  • Optical cameras speak "Visible Light."
  • Radar speaks "Microwaves."
  • Elevation maps speak "Height."
  • Land cover speaks "Types of Ground."

Most AI models struggle to translate between these languages, especially when the data comes at different resolutions (some images are sharp and detailed, others are coarse).

COP-GEN uses a clever design called a latent diffusion transformer.

  1. The Translator: It first translates all these different "languages" into a common, secret code (called "latent tokens"). It's like converting French, German, and Japanese into a universal "Morse code" that the AI understands.
  2. The Artist: It then uses a powerful "Transformer" (a type of AI brain good at understanding context) to mix these codes together.
  3. The Magic: When you ask it to generate an image, it doesn't just copy-paste. It "denoises" the secret code, slowly turning random static into a clear picture, while respecting the rules you gave it.
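The "denoising" in step 3 can be sketched with a toy that fits in a few lines. This is an illustration under loud assumptions, not COP-GEN's sampler: the "picture" is a single number, the data has just two possible "scenes" (at -2 and +2), the noise schedule is invented, and the function `predict_noise` uses an exact formula that only exists for this toy. In a real model, a trained transformer working on latent tokens plays that role.

```python
import math
import random

# Toy data distribution (assumption): two equally likely "scenes" at -2 and +2.
MODE, SPREAD = 2.0, 0.3

# Noise schedule (assumption): 300 steps with linearly increasing noise.
T = 300
betas = [1e-4 + (0.06 - 1e-4) * t / (T - 1) for t in range(T)]
alphas = [1.0 - b for b in betas]
alpha_bars = []
running = 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)

def predict_noise(x, t):
    """Exact noise prediction for the two-scene toy data.

    A trained network would stand here in a real diffusion model.
    """
    ab = alpha_bars[t]
    center = math.sqrt(ab) * MODE          # where each scene sits after noising
    var = ab * SPREAD**2 + (1.0 - ab)      # variance of each noised scene
    score = (center * math.tanh(center * x / var) - x) / var
    return -math.sqrt(1.0 - ab) * score

def generate(rng):
    """Turn random static into a sample by stepping the noise away."""
    x = rng.gauss(0.0, 1.0)                # start from pure static
    for t in reversed(range(T)):
        eps = predict_noise(x, t)
        x = (x - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps) / math.sqrt(alphas[t])
        if t > 0:                          # no fresh noise on the final step
            x += math.sqrt(betas[t]) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
samples = [generate(rng) for _ in range(1000)]
share_positive = sum(x > 0 for x in samples) / len(samples)
print(f"share of samples near +{MODE}: {share_positive:.2f}")
```

Each run starts from different static and settles close to one scene or the other rather than in the empty middle, so repeated runs give diverse but valid answers.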

Why This Matters: The "Weather Forecast" Analogy

Imagine you are a disaster manager. A storm is passing over the area. You have a radar image, because radar sees through the clouds, but the optical camera is blocked. You need to know what the ground looks like underneath to plan a rescue.

  • Old AI: "Here is the average ground." It might look like a muddy mess, and it misses the specific details needed for a rescue.
  • COP-GEN: "Here are five possible scenarios. In Scenario A, it's a muddy river. In Scenario B, it's a flooded road. In Scenario C, it's a dry field."

By giving you options instead of a single guess, COP-GEN helps humans understand the uncertainty. It tells you, "It could be this, or it could be that," which is much more useful for making real-world decisions.
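One simple way to turn a handful of scenarios into something a decision-maker can read is to compare them pixel by pixel. The sketch below is an assumption about how one might do this, not a procedure from the paper. The five tiny 2x3 "scenes" are hand-written stand-ins for model outputs (0.0 for dry ground, 1.0 for water), and the helper `per_pixel` is a hypothetical name.

```python
import statistics

# Hypothetical stand-ins for five generated realizations of a 2x3 scene.
# 0.0 = dry ground, 1.0 = water. Real values would come from the model.
realizations = [
    [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 1.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0]],
    [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 1.0], [0.0, 1.0, 0.0]],
]

def per_pixel(fn, stack):
    """Apply fn to the values that all realizations give for each pixel."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [fn([image[r][c] for image in stack]) for c in range(cols)]
        for r in range(rows)
    ]

# How often each pixel is water across the scenarios.
flood_frequency = per_pixel(statistics.fmean, realizations)

# How much the scenarios disagree at each pixel (0 means full agreement).
disagreement = per_pixel(statistics.pstdev, realizations)

for row in flood_frequency:
    print(" ".join(f"{value:.1f}" for value in row))
```

Pixels where every scenario agrees get a disagreement of zero, while the right-hand column, where the scenarios differ, stands out as the place that needs a closer look.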

The "Zero-Shot" Superpower

The coolest part? COP-GEN is a universal translator that doesn't need to be retrained for every new job.

  • Want to turn a map into a photo? Done.
  • Want to turn a photo into a radar image? Done.
  • Want to fill in missing colors in a photo? Done.

It's like a Swiss Army knife for satellite data. You don't need a different tool for every job; you just tell it what you have and what you want, and it figures out the rest.
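"Tell it what you have and what you want" can be pictured as handing every data type a role before generation starts. The sketch below is a guess at what such a request might look like, not COP-GEN's actual interface: the function `plan_generation` and the role names "condition", "generate", and "unused" are all hypothetical. Only the four data types come from the description above.

```python
MODALITIES = ("optical", "radar", "elevation", "land_cover")

def plan_generation(available, wanted):
    """Assign each modality a role for one generation run (hypothetical)."""
    unknown = (set(available) | set(wanted)) - set(MODALITIES)
    if unknown:
        raise ValueError(f"unknown modalities: {sorted(unknown)}")
    overlap = set(available) & set(wanted)
    if overlap:
        raise ValueError(f"cannot both provide and request: {sorted(overlap)}")

    plan = {}
    for name in MODALITIES:
        if name in available:
            plan[name] = "condition"   # provided by the user, guides the result
        elif name in wanted:
            plan[name] = "generate"    # starts as noise and gets denoised
        else:
            plan[name] = "unused"      # neither given nor requested
    return plan

# Same function, three different jobs, no retraining step in between.
map_to_photo = plan_generation(available={"land_cover"}, wanted={"optical"})
photo_to_radar = plan_generation(available={"optical"}, wanted={"radar"})
terrain_to_all = plan_generation(
    available={"elevation"}, wanted={"optical", "radar", "land_cover"}
)
print(map_to_photo)
```

The point of the sketch is that switching tasks only changes the request, never the model. How a real system treats the "unused" data types is not covered here.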

Summary

COP-GEN is a breakthrough because it stops trying to force the chaotic, changing Earth into a single, static box. Instead, it embraces the chaos. It understands that for every piece of data, there are many valid realities. By generating many possibilities instead of one average, it creates a more honest, flexible, and useful tool for understanding our planet.

It's the difference between a robot that says, "I am 100% sure this is a field," and an artist who says, "Based on what I see, it could be any of these beautiful, realistic fields."