PinRec: Outcome-Conditioned, Multi-Token Generative Retrieval for Industry-Scale Recommendation Systems

This paper introduces PinRec, a novel outcome-conditioned, multi-token generative retrieval model developed for Pinterest that successfully balances performance, diversity, and efficiency to meet industrial-scale recommendation needs and multiple business metrics.

Prabhat Agarwal, Anirudhan Badrinath, Laksh Bhasin, Jaewon Yang, Edoardo Botta, Jiajing Xu, Charles Rosenberg

Published 2026-03-05

Imagine Pinterest as a massive, endless library of ideas (called "Pins"). Every month, roughly 550 million people walk in, looking for inspiration. The library is so huge that if you just handed a librarian a list of "what you liked yesterday," they might struggle to guess exactly what you want to see next without getting overwhelmed or repeating the same old books.

For years, the library used a "Two-Tower" system. Think of this as two separate filing cabinets: one for what you searched for, and one for the books on the shelves. The librarian would just match the closest file in the cabinet to your search. It works okay, but it's rigid and can't really "think" about your future needs.
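To make the "two filing cabinets" idea concrete, here is a minimal sketch of two-tower matching. It assumes each tower has already turned the user and every Pin into a short list of numbers (an embedding); the Pin names, the vectors, and the helper `two_tower_retrieve` are all made up for illustration and are not taken from the paper.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Similarity score: larger means the two vectors point the same way."""
    return sum(x * y for x, y in zip(a, b))


def two_tower_retrieve(user_vec: List[float],
                       item_vecs: Dict[str, List[float]],
                       k: int = 2) -> List[str]:
    """Return the k items whose vectors score highest against the user vector."""
    ranked = sorted(item_vecs,
                    key=lambda pin: dot(user_vec, item_vecs[pin]),
                    reverse=True)
    return ranked[:k]


# Hypothetical outputs of the "item tower" (one vector per Pin).
ITEM_VECS = {
    "recipe_pin": [0.9, 0.1],
    "garden_pin": [0.1, 0.9],
    "baking_pin": [0.8, 0.3],
}
# Hypothetical output of the "user tower": one fixed vector per request.
USER_VEC = [1.0, 0.2]

top_pins = two_tower_retrieve(USER_VEC, ITEM_VECS, k=2)
print(top_pins)
```

The point of the sketch is the rigidity: the user is squeezed into a single vector, and every recommendation comes from one lookup against it.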

PinRec is the library's new, super-intelligent AI librarian. Instead of just matching files, this AI writes a story about what you should see next. Here is how it works, broken down into simple concepts:

1. The "Crystal Ball" Approach (Generative Retrieval)

Old systems were like a matchmaker who only looks at your past dates to find a similar partner.
PinRec is like a fortune teller. It looks at your entire history (what you clicked, saved, searched) and generates a prediction of what you will want to see next. Rather than making a single lookup against a fixed profile, it builds its recommendations step by step, token by token, like a writer composing the next chapter of a book.
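The "token by token" loop can be sketched in a few lines. This is a toy under stated assumptions, not the paper's model: `predict_next` is a hypothetical stand-in for the real sequence model (here it just averages the last two items), and each generated vector is mapped to the closest unseen Pin in a tiny made-up catalog.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict_next(history: List[List[float]]) -> List[float]:
    """Toy stand-in for a sequence model: average the two most recent items."""
    recent = history[-2:]
    return [sum(col) / len(recent) for col in zip(*recent)]


def nearest(vec: List[float], catalog: Dict[str, List[float]], exclude: set) -> str:
    """Map a generated vector to the closest Pin that has not been used yet."""
    candidates = [pin for pin in catalog if pin not in exclude]
    return max(candidates, key=lambda pin: dot(vec, catalog[pin]))


def generate(history_ids: List[str], catalog: Dict[str, List[float]], steps: int) -> List[str]:
    """Autoregressive loop: predict, look up, append to history, repeat."""
    history = [catalog[pin] for pin in history_ids]
    seen = set(history_ids)
    output = []
    for _ in range(steps):
        guess = predict_next(history)
        pin = nearest(guess, catalog, seen)
        output.append(pin)
        seen.add(pin)
        history.append(catalog[pin])  # the new item feeds the next prediction
    return output


# Hypothetical catalog: first number ~ "food", second ~ "flowers".
CATALOG = {
    "pasta": [1.0, 0.0],
    "pizza": [0.9, 0.1],
    "bread": [0.8, 0.2],
    "tulips": [0.0, 1.0],
    "roses": [0.1, 0.9],
}

recommended = generate(["pasta", "pizza"], CATALOG, steps=2)
print(recommended)
```

Each step depends on the previous one, which is what separates this from a one-shot lookup and also what makes plain generation slow.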

2. The "Remote Control" (Outcome-Conditioned Generation)

This is PinRec's superpower. In the real world, the library has different goals. Sometimes the boss wants more people to click on ads (passive engagement). Other times, they want people to save ideas to their boards (active engagement).

  • Old Way: The AI had to guess a "middle ground" recommendation that might be okay for both, but great for neither.
  • PinRec Way: The AI has a remote control. The engineers can press a button that says, "Today, we want more Saves," or "Today, we want more Clicks."
    • If they press "Save," the AI generates a list of beautiful, high-quality ideas people love to keep.
    • If they press "Click," it generates trendy, quick-to-consume ideas.
    • The Result: The library can instantly switch its strategy to match the business goal without rebuilding the whole system.
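One way to picture the remote control is as an extra input that nudges the model toward a chosen outcome. The sketch below is a simplified illustration, not the paper's architecture: the outcome vectors in `OUTCOME_VECS`, the Pin names, and the trick of adding the outcome vector to the user vector are all assumptions made for the example.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# Hypothetical learned vectors, one per business outcome.
OUTCOME_VECS = {
    "click": [1.0, 0.0],
    "save": [0.0, 1.0],
}

# Hypothetical catalog: first number ~ "click appeal", second ~ "save appeal".
CATALOG = {
    "celebrity_news": [0.9, 0.1],
    "weeknight_dinner": [0.5, 0.5],
    "diy_bookshelf_plans": [0.2, 0.9],
}


def conditioned_query(user_vec: List[float], outcome: str) -> List[float]:
    """Shift the user's vector toward the requested outcome (the 'button press')."""
    outcome_vec = OUTCOME_VECS[outcome]
    return [u + o for u, o in zip(user_vec, outcome_vec)]


def retrieve(user_vec: List[float], outcome: str,
             catalog: Dict[str, List[float]], k: int = 1) -> List[str]:
    """Same user, same catalog; only the outcome input changes the result."""
    query = conditioned_query(user_vec, outcome)
    ranked = sorted(catalog, key=lambda pin: dot(query, catalog[pin]), reverse=True)
    return ranked[:k]


USER_VEC = [0.5, 0.5]  # a user with no strong lean either way

for_saves = retrieve(USER_VEC, "save", CATALOG)
for_clicks = retrieve(USER_VEC, "click", CATALOG)
print(for_saves, for_clicks)
```

Nothing is retrained between the two calls. The strategy switch is just a different argument, which is the property the "Remote Control" section describes.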

3. The "Speed Reader" (Multi-Token Generation)

Usually, these AI writers are slow. They write one word (or one recommendation) at a time. If you ask for 10 recommendations, the AI has to write, pause, write, pause, 10 times. This is too slow for a busy library.

PinRec uses a Speed Reader technique. Instead of writing one word at a time, it writes multiple words at once.

  • Analogy: Imagine a chef plating food. The old way was plating one dish, waiting for the oven, then plating the next. PinRec is like a chef who can plate four dishes simultaneously.
  • The Benefit: It generates a diverse batch of recommendations in far fewer steps than writing them one at a time. This makes the system faster and gives the user a wider variety of ideas to explore.
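The saving comes from making fewer sequential trips to the model. The toy below counts those trips; `CountingModel` is a hypothetical stand-in that emits plain integers instead of real item representations, so it illustrates the bookkeeping only and says nothing about actual latency on real hardware.

```python
from typing import List


class CountingModel:
    """Toy model that records how many sequential calls it receives."""

    def __init__(self) -> None:
        self.calls = 0

    def predict(self, history: List[int], n: int) -> List[int]:
        """Emit the next n 'items' in a single call."""
        self.calls += 1
        last = history[-1]
        return [last + i + 1 for i in range(n)]


def generate(model: CountingModel, history: List[int],
             total: int, per_step: int) -> List[int]:
    """Generate `total` items, asking for up to `per_step` items per call."""
    history = list(history)  # do not mutate the caller's list
    output: List[int] = []
    while len(output) < total:
        n = min(per_step, total - len(output))
        preds = model.predict(history, n)
        output.extend(preds)
        history.extend(preds)
    return output


one_at_a_time = CountingModel()
single_output = generate(one_at_a_time, [3], total=10, per_step=1)

in_batches = CountingModel()
batched_output = generate(in_batches, [3], total=10, per_step=5)

print(one_at_a_time.calls, in_batches.calls)
```

In this toy both runs return the same ten items, but the batched run needs two calls instead of ten. A real model trades some accuracy and memory for that reduction, so the achievable batch size is an empirical question.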

4. The "Real-World" Test

The paper isn't just theory; the authors tested PinRec on the actual Pinterest app with hundreds of millions of users.

  • The Result: By using this new AI librarian, Pinterest saw a 2% increase in clicks across the whole site and a 4% increase in people saving ideas from search results.
  • Why it matters: It proved that you can use "generative" AI (the kind that writes stories) for massive, real-world recommendation systems without it being too slow or expensive.

Summary

PinRec is like upgrading a library from a static filing cabinet to a dynamic, shape-shifting storyteller.

  1. It writes recommendations instead of just searching for them.
  2. It listens to a remote control to decide whether to focus on clicks or saves.
  3. It writes in batches to be super fast.
  4. It makes the user happier and the business more successful, all while running efficiently at industrial scale.