Over-Searching in Search-Augmented Large Language Models
This paper systematically evaluates "over-searching" in search-augmented large language models, where unnecessary search-tool invocations harm both efficiency and accuracy. It proposes the Tokens Per Correctness (TPC) metric, along with mitigation strategies, to address this issue.
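The summary names the TPC metric without stating its formula. As a rough illustration only, the sketch below assumes TPC is the total number of tokens consumed across a set of queries divided by the number of correctly answered queries, so that lower values indicate cheaper correct answers. That definition, the `Episode` record, and the `tokens_per_correctness` function are all hypothetical and may differ from the paper's exact formulation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Episode:
    """One evaluated query (hypothetical record, not from the paper)."""
    tokens: int    # tokens consumed for this query, including any search calls
    correct: bool  # whether the final answer was judged correct


def tokens_per_correctness(episodes: Iterable[Episode]) -> float:
    """Assumed TPC: total tokens spent divided by the number of correct answers."""
    episodes = list(episodes)
    total_tokens = sum(e.tokens for e in episodes)
    num_correct = sum(1 for e in episodes if e.correct)
    if num_correct == 0:
        # No correct answers: the cost per correct answer is unbounded.
        return float("inf")
    return total_tokens / num_correct


# Illustrative numbers only: three queries, two answered correctly.
runs = [Episode(100, True), Episode(300, False), Episode(200, True)]
print(tokens_per_correctness(runs))
```

Under this assumed definition, a model that issues unnecessary searches spends more tokens without gaining correct answers, so its TPC rises even when its accuracy is unchanged.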