Prompt Programming for Cultural Bias and Alignment of Large Language Models

This paper validates the persistence of cultural biases in open-weight large language models and demonstrates that using DSPy to optimize prompts as modular programs offers a more stable and transferable approach to achieving cultural alignment than manual prompt engineering.

Maksim Eren, Eric Michalak, Brian Cook, Johnny Seales Jr

Published 2026-03-18

Imagine you have a super-smart, all-knowing robot librarian named "LLM." This robot has read almost every book ever written, but there's a catch: it was mostly trained on books from the Western world (the US, the UK, and Europe). Because of this, when you ask the robot a question, it tends to answer with Western values, priorities, and ways of thinking, even if you ask it to speak for someone in a completely different culture, like a farmer in Kenya or a shopkeeper in Japan.

This paper is about a team of researchers at Los Alamos National Laboratory who wanted to fix this "cultural bias" so the robot can be a better, fairer assistant for people all over the world.

Here is the story of what they did, explained simply:

1. The Problem: The Robot's "Default Setting"

The researchers started by testing five different open-weight versions of this robot (models like Llama and Gemma). They asked each one a series of questions drawn from a global survey of human values (the World Values Survey).

The Analogy: Imagine the robot is a chameleon. When you don't tell it what color to be, it automatically turns "Western Blue." It doesn't matter if you ask it to describe life in Brazil or China; without a specific instruction, it defaults to its own "Western Blue" perspective.

The researchers found that, just like previous studies on expensive, closed robots, these freely available, open-weight robots also had this "Western Blue" default. On a map of human values, they clustered together, far away from many other cultures.
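
To make "far away on a map of values" concrete, here is one simple way such a gap could be scored (this summary does not give the paper's exact metric, so treat the function and numbers below as an illustrative sketch): average the model's numeric answers per survey question and measure the distance to real respondents' averages for each country.

```python
import numpy as np

def cultural_gap(model_answers: dict[str, float],
                 survey_averages: dict[str, float]) -> float:
    """Average absolute gap between a model's answers and real survey
    averages over the questions both sides answered (lower = closer)."""
    shared = model_answers.keys() & survey_averages.keys()
    return float(np.mean([abs(model_answers[q] - survey_averages[q])
                          for q in shared]))

# Hypothetical WVS-style items scored on a 1-10 scale (made-up numbers).
model  = {"life_satisfaction": 7.9, "trust_in_strangers": 6.3}
kenya  = {"life_satisfaction": 6.1, "trust_in_strangers": 4.4}
sweden = {"life_satisfaction": 7.7, "trust_in_strangers": 6.5}

print(cultural_gap(model, kenya))   # large gap: the "Western Blue" default
print(cultural_gap(model, sweden))  # small gap: already close to Western data
```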

2. The First Fix: "Manual Prompt Engineering" (The Sticky Note)

To fix this, the researchers tried a simple trick. They added a "sticky note" to the front of every question.

  • Without the note: "How happy are you?" (Robot answers with Western values).
  • With the note: "You are a citizen of Egypt. How happy are you?"

The Analogy: This is like telling the chameleon, "Okay, pretend you are a desert lizard." The robot does a better job! It shifts its answers closer to how real people in Egypt actually feel. This is called Prompt Engineering. It works, but it's a bit like manually writing a new sticky note for every single country you visit. It's tedious and might not be perfect.
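
In code, the "sticky note" is nothing more than a hand-written prefix glued onto every question. A minimal sketch (the exact persona wording the authors used is not shown here, so the phrasing is illustrative):

```python
def persona_prompt(country: str, question: str) -> str:
    """Prepend a hand-written 'sticky note' persona to a survey question."""
    return f"You are a citizen of {country}. Answer as a typical person there would. {question}"

# One sticky note per country: easy for one country, tedious for dozens.
print(persona_prompt("Egypt", "On a scale of 1 to 10, how happy are you?"))
```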

3. The Big Innovation: "Prompt Programming" (The Smart Auto-Pilot)

The researchers asked: Can we do better than writing sticky notes by hand?

They used a tool called DSPy. Think of DSPy as a "smart auto-pilot" for the robot's instructions. Instead of a human writing the sticky note, they let the computer write and test thousands of different versions of the instruction to see which one works best.

The Analogy:

  • Manual Engineering: You are a chef trying to make a dish taste like "Italy." You taste it, add a pinch of basil, taste it again, add a pinch of oregano. You are doing the work manually.
  • Prompt Programming (DSPy): You give the recipe to a super-fast robot chef. It instantly cooks 1,000 versions of the dish, tastes them all against a "perfect Italian flavor" target, and automatically picks the one that is closest. It then figures out the exact perfect recipe on its own.
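
Here is roughly what that auto-pilot looks like as a DSPy program. The paper's actual code, model choices, metric, and optimizer settings are not given in this summary, so every name below is an illustrative assumption (and DSPy's API shifts a little between versions). Note the prompt_model argument: that is where the "which brain writes the instructions" finding from the next section plugs in.

```python
import dspy

# Assumed models: a small "worker" answers the survey, while a larger
# "teacher" drafts candidate instructions during optimization.
task_lm = dspy.LM("ollama_chat/llama3")
prompt_lm = dspy.LM("openai/gpt-4o")
dspy.configure(lm=task_lm)

class SurveyAnswer(dspy.Signature):
    """Answer the survey question as a typical resident of the country."""
    country: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.OutputField(desc="a single number on the survey's scale")

program = dspy.Predict(SurveyAnswer)

# Training examples pair WVS-style questions with real survey averages
# (values made up for illustration; MIPROv2 needs a larger trainset).
trainset = [
    dspy.Example(country="Egypt",
                 question="How satisfied are you with your life? (1-10)",
                 answer="6").with_inputs("country", "question"),
]

def alignment_metric(example, prediction, trace=None):
    # Hypothetical "taste test": reward answers close to the real survey
    # average, assuming the model returns a bare number on a 1-10 scale.
    try:
        gap = abs(float(prediction.answer) - float(example.answer))
    except ValueError:
        return 0.0
    return 1.0 - gap / 9.0

# MIPROv2 writes, scores, and selects candidate instructions automatically;
# prompt_model is the "robot chef" that drafts the recipes.
optimizer = dspy.MIPROv2(metric=alignment_metric,
                         prompt_model=prompt_lm,
                         auto="light")
optimized = optimizer.compile(program, trainset=trainset)

print(optimized(country="Egypt", question="How happy are you? (1-10)").answer)
```

In the chef analogy, alignment_metric is the taste test and MIPROv2 is the robot chef: it cooks many candidate instructions, scores each one against the survey data, and keeps the best.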

4. What They Found

The researchers compared the "Manual Sticky Note" method against the "Smart Auto-Pilot" (DSPy) method.

  • The Result: The Smart Auto-Pilot won. It didn't just nudge the robot toward the target culture; it often moved it much closer than the hand-written sticky notes did.
  • The Surprise: The auto-pilot was especially good at helping the robot understand cultures that were very different from its Western training data (like countries in Africa or the Middle East). For Western countries, the robot was already close, so the improvement was small. But for distant cultures, the auto-pilot made a huge difference.
  • The Secret Sauce: They found that the "brain" used to write the instructions mattered. Using a very smart, large robot to write the instructions for the smaller robot worked better than using a small robot to write them.

5. Why This Matters

Why should you care?
Because these robots are starting to be used for serious jobs: writing laws, summarizing news, helping governments make decisions, and auditing documents.

If a robot is making decisions for a country in the Middle East but thinks like a person from New York, it might suggest policies that don't make sense or feel unfair to the local people. This paper shows that by using Prompt Programming, we can "tune" these robots to respect and reflect the values of the specific people they are serving, making them more fair and useful tools for everyone, not just the West.

Summary

  • The Issue: AI robots naturally think like Westerners.
  • The Old Fix: Manually telling the robot, "Pretend you are from Country X." (Works okay).
  • The New Fix: Using a smart computer program (DSPy) to automatically write the best possible instructions to make the robot think like Country X. (Works much better).
  • The Goal: To make AI a fair partner in strategic decisions and daily life for people everywhere, not just in the West.
