AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications
This paper introduces AMA-Bench, a benchmark for evaluating long-horizon memory in agentic applications using real-world and synthetic machine-generated trajectories. It also proposes AMA-Agent, a causality-driven memory system that addresses the limitations of current similarity-based retrieval methods and significantly outperforms existing baselines.