Adaptive Memory Admission Control for LLM Agents

This paper proposes Adaptive Memory Admission Control (A-MAC), a framework that decomposes memory value into five interpretable factors to enable transparent, efficient, and domain-adaptive long-term memory management for LLM agents, achieving superior precision-recall tradeoffs and reduced latency compared to state-of-the-art systems.

Guilin Zhang, Wei Jiang, Xiejiashan Wang, Aisha Behr, Kai Zhao, Jeffrey Friedman, Xu Chu, Amine Anoun

Published 2026-03-06

Imagine you have a brilliant but slightly overwhelmed personal assistant named "Agent." This Agent is great at chatting, solving problems, and using tools. However, it has a major flaw: it has a terrible memory.

Sometimes, it forgets important things you told it yesterday. Other times, it remembers things you never said (hallucinations) or holds onto every single "Hello" and "Thanks" you ever typed, cluttering its brain until it can't find the important stuff.

Current solutions are like two extremes:

  1. The Hoarder: It saves everything. This makes the brain huge, slow, and full of junk.
  2. The Forgetful Genius: It saves nothing unless a very expensive, slow "brain scan" (a complex AI model) tells it to. This is accurate but takes forever and costs a lot of money.

The paper introduces A-MAC (Adaptive Memory Admission Control). Think of A-MAC as a smart, efficient bouncer standing at the door of the Agent's long-term memory. Instead of letting everything in or asking a super-expensive expert to check every single guest, A-MAC uses a quick, five-point checklist to decide who gets in.

Here is how the "Bouncer" works, using five simple rules:

1. The Five-Point Checklist (The "Value Signals")

When a piece of information (a "guest") tries to enter the memory, A-MAC asks five questions:

  • 🔮 Future Utility (Will this be useful later?):
    • Analogy: "If I save this, will I need it for a future task?"
    • How it works: This is the one check that calls an AI model, asking whether the fact helps solve future problems or is just small talk.
  • 🛡️ Factual Confidence (Did they actually say this?):
    • Analogy: "Is this a proven fact, or is the guest making things up?"
    • How it works: It checks the chat history. If the Agent said it, it's a "maybe." If the User said it, it's a "yes." This stops the Agent from remembering its own lies (hallucinations).
  • ✨ Semantic Novelty (Have we heard this before?):
    • Analogy: "Is this a new story, or are they just repeating the same joke?"
    • How it works: It checks if the memory is already in the database. If it's a duplicate, it gets kicked out to save space.
  • ⏳ Temporal Recency (How fresh is this?):
    • Analogy: "Is this news from today or a rumor from last year?"
    • How it works: Old information slowly fades in value (like milk expiring). Recent chats get a higher score.
  • 🏷️ Content Type Prior (What kind of info is this?):
    • Analogy: "Is this a permanent rule (like 'I hate cilantro') or a temporary mood (like 'I'm tired right now')?"
    • How it works: This is the most important rule. The system knows that "User Preferences" are gold and should always be saved, while "Current Weather" or "Mood" is usually junk that should be forgotten.
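As a rough sketch, the five checks could be computed like this. Everything here is an illustrative assumption, not the paper's implementation: the type-prior values, the word-overlap stand-in for embedding similarity, and the one-day decay half-life are all invented for the example.

```python
# Sketch of A-MAC's five value signals (illustrative assumptions throughout).

TYPE_PRIOR = {            # content-type prior: stable facts score high,
    "preference": 1.0,    # transient states score low (assumed values)
    "fact": 0.8,
    "task": 0.6,
    "mood": 0.1,
    "smalltalk": 0.0,
}

def factual_confidence(speaker: str) -> float:
    # Trust what the user said; penalize agent-generated statements so the
    # agent does not persist its own hallucinations as memories.
    return 1.0 if speaker == "user" else 0.4

def semantic_novelty(candidate: set, stored: list) -> float:
    # Stand-in novelty check: 1 minus the largest word-overlap (Jaccard)
    # with any stored memory. A real system would compare embeddings.
    if not stored:
        return 1.0
    overlap = max(len(candidate & m) / len(candidate | m) for m in stored)
    return 1.0 - overlap

def temporal_recency(age_seconds: float, half_life: float = 86400.0) -> float:
    # Exponential decay: with a one-day half-life, yesterday's chat is
    # worth half of today's.
    return 0.5 ** (age_seconds / half_life)
```

The future-utility signal is the only one missing here, because it is the one check that needs a model call rather than a cheap rule.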

2. The Decision Process

The Bouncer doesn't just guess. It gives the guest a score based on these five rules.

  • High Score? The guest gets a VIP pass into long-term memory.
  • Low Score? The guest is politely turned away.

Crucially, the Bouncer is hybrid. It does the heavy lifting (checking facts, novelty, and type) using fast, cheap, simple rules (like a calculator). It only calls in the "expensive expert" (the big AI model) once to check if the information will be useful in the future. This makes the whole process 31% faster than other methods while being smarter.
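The scoring step above can be sketched as a weighted sum compared against a cutoff. The weights, threshold, and example signal values below are illustrative assumptions, not numbers from the paper; the point is that four signals come from cheap rules, so the expensive model is consulted at most once per candidate (for `utility`).

```python
# Hybrid admission decision: weighted sum of five signals vs. a threshold.
# Weights and threshold are assumed for illustration.
WEIGHTS = {"utility": 0.3, "confidence": 0.2, "novelty": 0.2,
           "recency": 0.1, "type_prior": 0.2}
THRESHOLD = 0.5

def admit(signals: dict) -> bool:
    score = sum(WEIGHTS[k] * signals[k] for k in WEIGHTS)
    return score >= THRESHOLD

# A user preference scores high on every signal and gets the VIP pass:
preference = {"utility": 0.9, "confidence": 1.0, "novelty": 1.0,
              "recency": 1.0, "type_prior": 1.0}
# Small talk scores low and is politely turned away:
smalltalk = {"utility": 0.1, "confidence": 1.0, "novelty": 0.3,
             "recency": 1.0, "type_prior": 0.0}
```

With these assumed weights, `admit(preference)` returns `True` and `admit(smalltalk)` returns `False`, matching the VIP-pass-versus-turned-away behavior described above.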

3. Why This Matters (The Results)

The researchers tested this on LoCoMo, a benchmark for long-term conversational memory. Here is what happened:

  • Better Accuracy: A-MAC was much better at keeping the right memories and forgetting the wrong ones. It raised the F1 score (a measure that balances precision and recall) to 0.583, beating the previous best methods.
  • Speed: Because it uses simple rules for most checks, it was 31% faster than the competition.
  • No More Hallucinations: By strictly checking if the information was actually said by the user, it stopped the Agent from remembering things that never happened.

The Big Takeaway

Before A-MAC, building a memory for AI agents was like trying to fill a library by throwing books in a pile and hoping the librarian sorts them later. It was messy, slow, and full of errors.

A-MAC is like hiring a professional librarian with a strict, transparent checklist. It knows exactly what to keep, what to throw away, and why. It ensures the AI's brain stays clean, fast, and reliable, so it can actually remember what you told it last week without getting confused by its own daydreams.

In short: A-MAC teaches AI agents how to be good note-takers, not just good talkers.