Imagine you are walking through a massive, high-tech supermarket.
In the old days of shopping (and old-school recommendation systems), the store only knew what you bought. If you bought milk, they assumed you liked milk. If you bought cereal, they assumed you liked cereal. They had no idea what you looked at, what you picked up and put back, or what you ignored. They only saw the final transaction.
This paper introduces a new way of thinking: "Impression-Aware Recommender Systems."
Instead of just watching what you buy, imagine the store now has a camera that records everything you look at. It sees the cereal you stared at for ten seconds, the milk you picked up and put back, and the fancy cookies you walked right past without a glance.
Here is the breakdown of the paper using simple analogies:
1. The Core Concept: The "Menu" vs. The "Order"
- The Old Way (Interactions): The system only knows your Order. You ordered a burger. They think, "Great, they love burgers!"
- The New Way (Impressions): The system knows the Menu you were shown. They see that you were shown a burger, a salad, and a steak. You ordered the burger, but you looked at the steak for a long time and ignored the salad.
- The Insight: Just because you didn't order the steak doesn't mean you hate it. Maybe you were full, maybe you were in a hurry, or maybe the burger just looked better. By knowing what was on the menu (the impression), the system can guess your true taste much better.
2. The Three Big Questions the Paper Answers
The authors looked at 43 different research papers to figure out how to use this "Menu" data. They organized their findings into three buckets:
A. The Models (The Chefs)
How do the computers (the chefs) cook with this new ingredient?
- The Simple Chefs: Some just use basic rules, like "If a customer has been shown a steak 5 times without buying it, stop showing them steak."
- The Smart Chefs (Deep Learning): Most modern systems use super-smart AI (like a master chef who has tasted every dish in the world) to figure out complex patterns. They look at the whole menu you were shown and guess what you really want.
- The Gamers (Reinforcement Learning): Some systems treat recommendations like a video game. They try different menus, see what happens, and learn from the score to get better over time.
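The "Simple Chefs" rule above can be written as a frequency cap on impressions. Here is a minimal sketch; the threshold and function names are illustrative, not taken from the paper:

```python
from collections import defaultdict

IMPRESSION_CAP = 5  # illustrative threshold, not from the paper

# (user, item) -> consecutive impressions with no click
unclicked = defaultdict(int)

def record_event(user, item, clicked):
    """Update the counter from one impression event."""
    if clicked:
        unclicked[(user, item)] = 0  # interest confirmed, reset
    else:
        unclicked[(user, item)] += 1

def filter_candidates(user, candidates):
    """Drop items this user has repeatedly ignored."""
    return [i for i in candidates if unclicked[(user, i)] < IMPRESSION_CAP]
```

Notice that the rule only works because the system logs what was shown, not just what was clicked: a pure interaction log has no way to count ignored exposures.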
B. The Data (The Ingredients)
Where do we get this "Menu" data?
- Contextual Data (The Best Ingredient): This is a perfect record. It says: "We showed you [Burger, Steak, Salad]. You clicked the Burger." This is gold because we know exactly what you ignored.
- Global Data (The Messy Ingredient): This is a record that says: "We showed you a menu. You bought a burger." But it doesn't say which menu the burger came from. It's like knowing you ate a burger but not knowing if it was from the Italian menu or the American menu. It's useful, but less precise.
- The Problem: The paper notes that while we have a lot of data, we don't have enough perfect (Contextual) data available for everyone to use.
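The gap between the two ingredient types is easiest to see in toy log formats. A minimal sketch, with hypothetical field names:

```python
# Contextual impressions: we know the exact list shown and what was clicked.
contextual_log = [
    {"user": "u1", "shown": ["burger", "steak", "salad"], "clicked": "burger"},
]

# Global impressions: only aggregate exposure, detached from any single purchase.
global_log = [
    {"user": "u1", "times_shown": {"burger": 3, "steak": 2, "salad": 1}},
    {"user": "u1", "purchased": "burger"},
]

def implicit_negatives(record):
    """With contextual data, items shown but not clicked become candidate negatives."""
    return [i for i in record["shown"] if i != record["clicked"]]
```

Only the contextual record lets us say "the user saw the salad and skipped it"; the global record can never pair a purchase with the exact menu it came from.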
C. The Evaluation (The Taste Test)
How do we know if the new system is actually better?
- The Trap: If you test a new chef by only giving them the dishes people ordered, you aren't testing their ability to choose. You're just testing if they can copy the order.
- The Challenge: To test these new systems properly, we need to simulate the whole experience: "Here is the menu we showed you. Did you like the choices?"
- The Bias: The paper warns that these systems can get biased. If the system only shows you popular items, it will think you only like popular items. We need to be careful not to create a "Filter Bubble" where you only see what the system thinks you want, rather than what you might actually enjoy.
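One way to avoid the "copy the order" trap is to evaluate a model only against the menu it actually served: re-rank the shown slate and check whether the clicked item comes out on top. A minimal sketch; the scoring dictionary and metric name are illustrative:

```python
def rank_within_impression(scores, shown):
    """Re-rank only the items the user was actually shown, not the full catalog."""
    return sorted(shown, key=lambda item: scores.get(item, 0.0), reverse=True)

def hit_at_k(scores, record, k=1):
    """1 if the clicked item lands in the top-k of the re-ranked slate, else 0."""
    ranked = rank_within_impression(scores, record["shown"])
    return int(record["clicked"] in ranked[:k])
```

This tests the model's ability to choose among the options that were really on the table, rather than rewarding it for guessing globally popular items.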
3. The "User Fatigue" Metaphor
One of the coolest ideas in the paper is User Fatigue.
Imagine a DJ playing music at a party.
- Old System: The DJ only knows what songs people danced to. So, they keep playing the same 3 dance hits.
- New System: The DJ sees that people are standing still, looking at their phones, or yawning when the 4th dance hit comes on. Even though no one has complained or left the party, the DJ connects each "impression" (a song played) with the crowd's reaction and realizes, "Okay, they are getting tired of this song."
- The Result: The new system knows when to switch genres to keep the party alive, preventing people from getting bored and leaving.
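The DJ's reasoning can be sketched as a decaying fatigue score per genre: every lukewarm reaction adds fatigue, and old exposures fade over time. All constants here are made up for illustration:

```python
DECAY = 0.8          # how fast old impressions fade (illustrative)
FATIGUE_LIMIT = 2.5  # switch genres once fatigue passes this (illustrative)

def update_fatigue(fatigue, genre, crowd_danced):
    """Decay all scores, then add fatigue if the crowd ignored this play."""
    updated = {g: score * DECAY for g, score in fatigue.items()}
    if not crowd_danced:
        updated[genre] = updated.get(genre, 0.0) + 1.0
    return updated

def should_switch(fatigue, genre):
    """True once repeated unenthusiastic impressions pile up for a genre."""
    return fatigue.get(genre, 0.0) >= FATIGUE_LIMIT
```

The decay matters: a song the crowd ignored an hour ago should count for less than one they ignored just now, so fatigue can fade and old favorites can come back.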
4. What's Next? (The Open Questions)
The authors say we are still in the early days. Here is what they think we need to do next:
- Stop Guessing: Currently, most systems assume that if you didn't click, you hated it. The paper says: "Wait, maybe you just didn't see it, or you were distracted." We need better ways to figure out if a "no-click" is a "no" or just a "not right now."
- More Data: We need more public datasets (open recipes) so researchers can test these ideas without needing secret company data.
- Fixing Biases: We need to make sure the system isn't just showing us the same popular things over and over. We need to use the "Menu" data to fix these biases and show us a wider variety of things.
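A common way to use impression data against popularity bias is inverse propensity weighting: a click on a rarely shown item counts for more than a click on something the system pushed constantly. A minimal sketch, assuming a simple exposure-based propensity estimate; the clipping constant is illustrative:

```python
def exposure_propensity(times_shown, total_impressions):
    """Naive propensity estimate: how often this item was put on the menu."""
    return times_shown / total_impressions

def click_weight(times_shown, total_impressions, clip=1e-3):
    """Weight a click inversely to its exposure, clipped to avoid huge weights."""
    p = max(exposure_propensity(times_shown, total_impressions), clip)
    return 1.0 / p
```

Training on weighted clicks like these nudges the model toward items users chose despite low exposure, instead of items that were simply shown the most.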
Summary
This paper is a map for a new era of recommendation systems. It tells us that to truly understand what people like, we can't just watch what they buy. We have to watch what they see, what they ignore, and how they react to the whole menu of options presented to them. By doing this, we can build systems that feel less like a robot guessing your order and more like a thoughtful friend who knows your taste perfectly.