ThunderAgent: A Simple, Fast and Program-Aware Agentic Inference System
ThunderAgent is a program-aware agentic inference system that unifies LLM and tool resource management through an "LLM Program" abstraction. It improves throughput and memory efficiency by optimizing KV cache utilization and by preparing tool environments asynchronously.
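To make the idea of asynchronous environment preparation concrete, the following is a minimal sketch, not ThunderAgent's implementation. It assumes that environment setup can start before the LLM finishes decoding, so the two overlap instead of running back to back. The names `prepare_environment`, `llm_step`, and `run_program`, and the sleep durations, are hypothetical stand-ins chosen for illustration.

```python
import asyncio
import time


async def prepare_environment(name: str, setup_seconds: float) -> str:
    # Hypothetical stand-in for slow tool setup (e.g. starting a sandbox).
    await asyncio.sleep(setup_seconds)
    return f"{name}-ready"


async def llm_step(prompt: str, decode_seconds: float) -> str:
    # Hypothetical stand-in for an LLM generation call.
    await asyncio.sleep(decode_seconds)
    return f"action-for:{prompt}"


async def run_program(prompt: str) -> tuple[str, str, float]:
    start = time.perf_counter()
    # Launch environment setup as a task so it runs while the LLM decodes.
    env_task = asyncio.create_task(prepare_environment("sandbox", 0.3))
    action = await llm_step(prompt, 0.3)
    # By the time the action is available, setup is usually already done.
    env = await env_task
    elapsed = time.perf_counter() - start
    return action, env, elapsed


if __name__ == "__main__":
    action, env, elapsed = asyncio.run(run_program("fix the bug"))
    print(action, env, f"{elapsed:.2f}s")
```

In this toy setup the two 0.3-second steps overlap, so the total wall time is close to 0.3 seconds rather than the 0.6 seconds a strictly sequential schedule would take.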