Bridging the climate to energy data gap: simulated annealing for representative climate year selection

This study proposes and validates a simulated annealing optimization method, utilizing the seasonal sliced Wasserstein distance, to select highly representative subsets of climate years from large ensembles, significantly outperforming current practices and alternative algorithms to provide robust, unbiased inputs for energy system modeling.

Original authors: Bram van Duinen, Karin van der Wiel, Jean Thorey, Laurens Stoop

Published 2026-05-18. Author reviewed.

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written by the authors. For technical accuracy, refer to the original paper.

Imagine you are trying to design a power grid that can handle the weather for the next 30 years. Weather is chaotic and unpredictable, so climate scientists have run supercomputer simulations that generated 180 different possible years of weather data, covering a wide range of scenarios (from very windy years to droughts).

However, the computer models used to design the actual power grid are computationally heavy and slow. They can't process 180 years of data at once; they can only handle a small subset, such as 5 or 30 years.

The big question is: Which specific years should we pick?

If you pick the wrong years, you might build a grid that works great in a mild summer but collapses during a cold, windless winter, and you could waste billions of dollars on the wrong infrastructure.

The Problem with Current Methods

Right now, many energy planners pick years somewhat randomly or by just looking at the "average" year. The authors of this paper say this is like trying to understand a whole library by reading just one random page. It often misses the extreme events (like a "Dunkelflaute", a period of very little wind and very little sun) that are crucial for planning.

The Solution: A "Smart Search" (Simulated Annealing)

The authors propose a new method called Simulated Annealing.

The Analogy:
Imagine you are in a vast, foggy mountain range, and you want to find the absolute lowest valley (the best set of years).

  • Random Search is like throwing a dart at a map and walking there. You might get lucky, but you'll probably miss the deepest valley.
  • K-Medoids (the old standard) is like grouping the mountains into clusters and picking the center of each group. It's okay, but it might miss the specific shape of the terrain.
  • Simulated Annealing is like a hiker who is smart but also willing to take a risk.
    • The hiker starts at a random spot.
    • They look around. If they find a lower spot, they move there.
    • Crucially: Sometimes, they might take a step uphill (a worse spot) just to see if there is an even deeper valley on the other side of that hill.
    • As the "hike" goes on, they get less willing to take those risky uphill steps and start focusing on finding the absolute bottom.
    • This prevents them from getting stuck in a small, shallow dip (a local minimum) and missing the true lowest point (the global minimum).
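The hiker's routine above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `simulated_annealing`, the "swap one year" move, the geometric cooling schedule, the step count, and the toy score (how far the subset mean of some made-up numbers is from the full mean) are all assumptions chosen for the example. In the paper's setting, the score would be the distance measure described in the next section.

```python
import math
import random


def simulated_annealing(n_total, k, score, n_steps=2000,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Pick k of n_total years that minimise score(subset). Requires k < n_total.

    Hypothetical sketch: the move type and cooling schedule are assumptions.
    """
    rng = random.Random(seed)
    current = rng.sample(range(n_total), k)      # the hiker's random start
    current_score = score(current)
    best, best_score = list(current), current_score

    for step in range(n_steps):
        # Temperature falls geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))

        # Propose a neighbour: swap one selected year for an unselected one.
        candidate = list(current)
        unselected = [y for y in range(n_total) if y not in current]
        candidate[rng.randrange(k)] = rng.choice(unselected)
        cand_score = score(candidate)

        # Always accept downhill moves; accept uphill moves with a
        # probability that shrinks as the temperature drops.
        delta = cand_score - current_score
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_score = candidate, cand_score
            if current_score < best_score:
                best, best_score = list(current), current_score

    return sorted(best), best_score


# Toy data (made up): one number per year, 180 years in total.
data_rng = random.Random(42)
toy = [data_rng.gauss(0.0, 1.0) for _ in range(180)]
full_mean = sum(toy) / len(toy)


def toy_score(subset):
    """Toy stand-in for a real distance: gap between subset mean and full mean."""
    return abs(sum(toy[i] for i in subset) / len(subset) - full_mean)


selected, final_score = simulated_annealing(180, 30, toy_score)
print(selected, final_score)
```

Because the sketch remembers the best subset seen so far, the returned score can never be worse than that of the random starting point, even though individual steps are allowed to go uphill.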

How They Measure "Goodness"

How do they know if their chosen 5 or 30 years are actually good? They use a mathematical tool called the Seasonal Sliced Wasserstein Distance.

The Analogy:
Think of the 180 years of weather data as a giant, complex smoothie made of many ingredients (wind, sun, temperature, electricity demand).

  • A simple average might just check if the total amount of strawberries is right.
  • This new tool checks:
    1. The Ingredients: Is the right amount of wind and sun there?
    2. The Mix: Do the ingredients blend correctly? (e.g., Does high wind usually happen with low sun? Or do they happen together?)
    3. The Timing: Is the mix right for winter and summer separately? (Wind matters most in winter, when heating demand is high and there is less sun. If you pick years that are windy in summer but calm in winter, you fail the test).

The tool calculates a "score" of how different your small smoothie (the selected years) is from the giant smoothie (all 180 years). The lower the score, the better the match.
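To make the "score" idea concrete, here is a rough NumPy sketch of a sliced Wasserstein distance computed per season and then averaged. It is an assumption-laden illustration, not the paper's exact definition: the function names, the number of random projections, the quantile-based approximation of the one-dimensional distance, the use of the order-1 distance, and the plain average over seasons are all choices made for this example. It also assumes the variables have already been put on comparable scales and that every season appears in both data sets.

```python
import numpy as np


def sliced_wasserstein(a, b, n_proj=64, n_quantiles=100, seed=0):
    """Approximate sliced W1 distance between samples a (n, d) and b (m, d)."""
    rng = np.random.default_rng(seed)
    # Random unit directions: each one "slices" the d-dimensional data to 1D.
    dirs = rng.normal(size=(n_proj, a.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # In 1D, the Wasserstein distance compares quantile functions, so we
    # evaluate both sets of projections on a common grid of quantile levels.
    q = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qa = np.quantile(a @ dirs.T, q, axis=0)   # shape (n_quantiles, n_proj)
    qb = np.quantile(b @ dirs.T, q, axis=0)
    return float(np.mean(np.abs(qa - qb)))


def seasonal_sliced_wasserstein(full, full_season, subset, subset_season, **kw):
    """Average the sliced distance over seasons (assumed aggregation)."""
    seasons = np.unique(full_season)
    per_season = [
        sliced_wasserstein(subset[subset_season == s], full[full_season == s], **kw)
        for s in seasons
    ]
    return float(np.mean(per_season))


# Made-up data: 400 samples of 3 variables (say wind, sun, demand), 4 seasons.
rng = np.random.default_rng(1)
full = rng.normal(size=(400, 3))
season = np.tile(np.array([0, 1, 2, 3]), 100)

same = seasonal_sliced_wasserstein(full, season, full, season)
shifted = seasonal_sliced_wasserstein(full, season, full + 2.0, season)
print(same, shifted)
```

Comparing the data with itself gives a score of zero, while a copy with every variable shifted gives a clearly positive score. Because the projections mix the variables together, the score also reacts when the relationship between variables changes, not only when each variable's own distribution does.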

What They Found

The researchers tested their "Smart Search" method against random guessing, filtered guessing, and the old clustering method across three scenarios:

  1. Just the Netherlands (30 years).
  2. All of Europe (30 years).
  3. All of Europe (5 years).

The Results:

  • The Winner: The "Smart Search" (Simulated Annealing) consistently found the best sets of years.
  • The Magic Multiplier: When they picked just 30 years using this method, those 30 years were so representative that they acted like 130 to 140 years of data. They got 4 to 5 times more "value" out of the data than they physically had.
  • Better than Current Practice: The method is 2.5 to 3.5 times better than the current standard used by ENTSO-E, the European Network of Transmission System Operators for Electricity.
  • Consistency: Unlike other methods that rely heavily on "luck" (getting a good result just by chance), this method works reliably every time you run it.
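The "4 to 5 times" figure in the Magic Multiplier item is just the quoted range of 130 to 140 effective years divided by the 30 years actually selected. A quick check, using only the numbers stated above (the variable names are made up for the example):

```python
selected_years = 30
effective_years = (130, 140)   # range quoted in the results above

# How many "effective" years each selected year is worth.
multipliers = [n / selected_years for n in effective_years]
print([round(m, 2) for m in multipliers])   # [4.33, 4.67]
```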

The Bottom Line

This paper doesn't just say "pick better years." It provides a specific, validated recipe (Simulated Annealing + a specific scoring tool) so that when energy companies build the grid for the future, they aren't gambling on a lucky guess. They are using a small, carefully selected sample that closely mirrors the complex, chaotic reality of the full climate ensemble.

One final note on the "Year": The paper also suggests defining a "year" from April 1st to March 31st (instead of January to December). Why? Because this keeps the winter together in one block. Since winter is the most stressful time for the power grid (more heating demand and less sun), splitting winter across two calendar years would break the data and make it harder to plan for those critical cold snaps.
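As a small illustration of the April-to-March idea, the helper below assigns each date to such a year. The function name `climate_year` and the convention of labelling a year by the calendar year in which it starts are assumptions for this sketch; the paper may label its years differently.

```python
from datetime import date


def climate_year(d: date) -> int:
    """Label of the April-to-March year containing d (named by its April start)."""
    return d.year if d.month >= 4 else d.year - 1


# December 2020 and January 2021 fall in the same April-to-March year,
# so a whole winter stays in one block.
print(climate_year(date(2020, 12, 15)), climate_year(date(2021, 1, 15)))   # 2020 2020
# The boundary sits between 31 March and 1 April.
print(climate_year(date(2021, 3, 31)), climate_year(date(2021, 4, 1)))     # 2020 2021
```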
