Modeling User Preferences as Distributions for Optimal Transport-Based Cross-Domain Recommendation under Non-Overlapping Settings

This paper proposes DUP-OT, a novel cross-domain recommendation framework that models user preferences as Gaussian Mixture Models and utilizes optimal transport to align these distributions across non-overlapping domains, thereby effectively addressing data sparsity and cold-start issues without relying on shared users or items.

Ziyin Xiao, Toyotaro Suzumura

Published 2026-03-03

Imagine you are a travel guide trying to help a tourist navigate a new, confusing city (the "Target Domain") where they have no map and barely know anyone. This is the "Cold Start" problem in recommendation systems: the system doesn't know what the user likes because they are new or have very little history.

Usually, guides try to help by looking at the tourist's past visits to other cities (the "Source Domain"). But here's the catch: in the real world, you often can't link the tourist's past identity to their current one due to privacy rules. You can't say, "Oh, this is the same person who loved jazz in New York, so they'll love jazz here." The names and IDs are different.

This paper, DUP-OT, proposes a clever new way to solve this without needing to link specific people across cities.

The Old Way: The Rigid Checklist

Most recommendation systems represent a person's taste as one fixed vector, like a rigid checklist.

  • Example: "This user likes 30% Rock, 20% Jazz, and 50% Pop."
  • The Problem: This is too rigid. It reduces a person to a single point on a map, with no sense of how widely their taste spreads around that point. If the new city is slightly different, the checklist doesn't fit well. And if you can't link the tourist to their past self, you can't copy-paste their checklist at all.

The New Way: The "Flavor Cloud" (DUP-OT)

The authors suggest we stop thinking of taste as a checklist and start thinking of it as a cloud of flavors (a Gaussian Mixture Model).

Imagine a user's taste isn't a single dot, but a cloud of possibilities.

  • Maybe they are 70% "Rock Cloud" and 30% "Jazz Cloud," and each cloud has its own center, shape, and spread. They might like "Rock" in general while leaning toward something more specific, such as "90s Rock" or "Indie Rock."
  • Why this helps: Even if you can't link the specific tourist, you can look at the shape of the clouds in the old city and the shape of the clouds in the new city.
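
To make the "cloud" idea concrete, here is a minimal sketch of a one-dimensional Gaussian mixture density. The 70/30 weights come from the example above; the cluster centers, spreads, and the 1-D "taste axis" are illustrative assumptions, not values from the paper, which works with review embeddings in many dimensions.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture: a weighted sum of component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

# Hypothetical user: 70% "rock" cloud (wide) and 30% "jazz" cloud (narrow).
weights = [0.7, 0.3]
means = [0.0, 4.0]   # assumed cluster centers on a made-up 1-D taste axis
stds = [1.0, 0.5]    # assumed spreads; a larger value means a broader taste

for x in (0.0, 2.0, 4.0):
    print(f"taste density at {x:.1f}: {mixture_pdf(x, weights, means, stds):.3f}")
```

A single-point representation would only store something like the position 0.0. The mixture also records how much weight sits near 4.0 and how far each preference spreads.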

The Magic Bridge: Optimal Transport

So, how do we move the "flavor cloud" from the old city to the new one without knowing who is who?

The paper uses a mathematical tool called Optimal Transport. Think of this as a logistics company moving furniture.

  • The Scenario: You have a warehouse in City A (Source) full of furniture (User Preferences) and a warehouse in City B (Target) that needs furniture. You don't know which specific chair belongs to which person, but you know the types of furniture in both warehouses.
  • The Solution: The logistics company calculates the most efficient way to move the types of furniture from City A to City B to fill the gaps.
  • In the Paper: The system looks at the "Rock Clouds" in the source domain and the "Rock Clouds" in the target domain. It calculates the most efficient "transport plan" to align them. It essentially says, "The 'Indie Rock' cloud in the source domain matches best with the 'Alternative Rock' cloud in the target domain," and shifts the user's preferences accordingly.
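
The transport plan described above can be sketched with Sinkhorn iterations, a standard way to approximate optimal transport between two small histograms. This is a sketch under stated assumptions: the summary does not say which solver or cost function the authors use, so the entropic regularization, the squared distance between cluster centers as the cost, and all numbers below are hypothetical.

```python
import math

def sinkhorn(a, b, cost, reg=1.0, n_iters=500):
    """Approximate OT plan between histograms a and b via Sinkhorn scaling."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap moves get large entries, expensive moves small ones.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so each source cluster ships out exactly its mass.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so each target cluster receives exactly its mass.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

# Hypothetical cluster centers and cluster masses in each domain.
source_means = [(0.0, 0.0), (2.0, 0.0)]
target_means = [(0.2, 0.1), (2.1, -0.1), (1.0, 1.5)]
source_mass = [0.6, 0.4]
target_mass = [0.5, 0.3, 0.2]

cost = [[sq_dist(s, t) for t in target_means] for s in source_means]
plan = sinkhorn(source_mass, target_mass, cost)

for i, row in enumerate(plan):
    print(f"source cluster {i} ->", [round(p, 3) for p in row])
```

Each row of the plan says how one source cluster's mass is split across the target clusters. No user identities are involved, only cluster-level quantities.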

The Three-Step Recipe

The authors built a system (DUP-OT) that works in three simple stages:

  1. The Translator (Preprocessing):
    First, they take all the reviews people wrote (text) and translate them into a common language (embeddings). They use a shared "dictionary" so that a review about a "movie" in the Source Domain and a review about a "game" in the Target Domain can be understood in the same way.

  2. The Shape Shifter (GMM Modeling):
    Instead of making a checklist for every user, they build a "flavor cloud" for the whole city. They figure out the main "flavor clusters" (e.g., Action, Drama, Comedy) that exist in the data. Then, for each user, they just figure out how much of each flavor cluster they like. Because each user is described by only a few weights over shared clusters, this is much lighter and more flexible than a rigid vector.

  3. The Bridge Builder (Optimal Transport):
    This is the magic step. They use the logistics math to align the "flavor clusters" of the Source City with the Target City. Once aligned, they can take a user's "flavor profile" from the Source City and "transport" it to the Target City, giving the new system a head start on what the user might like.
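
Steps 2 and 3 can be sketched together as follows. This is one plausible reading rather than the authors' exact procedure: the isotropic clusters with equal priors, the averaging of soft assignments, the row-normalized transfer, and every name and number below are assumptions for illustration.

```python
import math

def soft_assign(x, means, std=1.0):
    """Soft membership of point x in each isotropic Gaussian cluster (equal priors)."""
    logits = [-sum((xi - mi) ** 2 for xi, mi in zip(x, m)) / (2.0 * std ** 2)
              for m in means]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def user_weights(review_embeddings, means):
    """Average the soft memberships of a user's reviews to get mixture weights."""
    resp = [soft_assign(x, means) for x in review_embeddings]
    return [sum(r[j] for r in resp) / len(resp) for j in range(len(means))]

def transport_weights(weights, plan):
    """Push source-cluster weights through a row-normalized transport plan."""
    out = [0.0] * len(plan[0])
    for i, w in enumerate(weights):
        row_sum = sum(plan[i])
        for j, p in enumerate(plan[i]):
            out[j] += w * p / row_sum
    return out

# Hypothetical inputs: two source clusters, one user's three review
# embeddings, and a cluster-level plan onto three target clusters.
source_means = [(0.0, 0.0), (2.0, 0.0)]
reviews = [(0.1, -0.2), (0.3, 0.1), (1.9, 0.2)]
plan = [[0.5, 0.0, 0.1],
        [0.0, 0.3, 0.1]]

w_src = user_weights(reviews, source_means)
w_tgt = transport_weights(w_src, plan)
print("source weights:", [round(w, 3) for w in w_src])
print("target weights:", [round(w, 3) for w in w_tgt])
```

Normalizing each row of the plan keeps the transported weights summing to one, so the result can be read as the user's "flavor profile" expressed in the target domain's clusters.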

Why It Matters

The authors tested this on Amazon data (for example, transferring knowledge from "Digital Music" users to "Electronics" users).

  • The Result: Even without knowing who the users were in the new city, this method predicted ratings much better than systems that just guessed based on the new city's data alone.
  • The Superpower: It was especially good at avoiding disastrous mistakes. If a user is new, a bad system might recommend something they absolutely hate (a huge error). DUP-OT's "cloud" approach is more cautious and robust, ensuring that even if it's not perfect, it won't be terrible.

In a Nutshell

Instead of trying to find a specific person's ID card to copy their history, DUP-OT looks at the general shape of their tastes (the cloud), figures out how those shapes map to the new environment using a smart logistics algorithm (Optimal Transport), and uses that to make a much smarter guess about what they will like next. It's like helping a tourist by understanding their style of travel, rather than needing their passport.
