DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
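The paper introduces DIVE, an evidence-driven framework for synthesizing agentic tasks. DIVE first executes diverse real-world tools and only then reverse-derives tasks from the evidence those executions produce, so that every task is grounded in actual tool behavior and the resulting task set is structurally varied. The authors report that this diversity-focused approach significantly improves the out-of-distribution generalization of tool-using LLMs compared with traditional quantity-focused scaling.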
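
The execute-first, derive-later ordering can be illustrated with a toy sketch. This is not the paper's implementation: the three arithmetic "tools", the `Step` and `Task` records, and the functions `execute_chain`, `derive_task`, and `synthesize` are all hypothetical names chosen for illustration, and real tools would be external APIs rather than pure functions. The point is only that the answer is read off an observed execution trace rather than written before any tool is run.

```python
import random
from dataclasses import dataclass

# Hypothetical stand-ins for real-world tools (assumption: int -> int).
def double(n: int) -> int:
    return 2 * n

def square(n: int) -> int:
    return n * n

def digit_sum(n: int) -> int:
    return sum(int(d) for d in str(abs(n)))

TOOLS = {"double": double, "square": square, "digit_sum": digit_sum}

@dataclass
class Step:
    tool: str    # name of the tool that was called
    arg: int     # input passed to the tool
    result: int  # output actually observed

@dataclass
class Task:
    question: str
    answer: int
    trace: list[Step]

def execute_chain(tool_names: list[str], start: int) -> list[Step]:
    """Run the tools first and record what really happened (the evidence)."""
    trace, value = [], start
    for name in tool_names:
        result = TOOLS[name](value)
        trace.append(Step(name, value, result))
        value = result
    return trace

def derive_task(trace: list[Step]) -> Task:
    """Reverse-derive a task whose answer is grounded in the recorded trace."""
    chain = ", then ".join(step.tool for step in trace)
    question = (
        f"Starting from {trace[0].arg}, apply {chain}. "
        "What is the final value?"
    )
    return Task(question=question, answer=trace[-1].result, trace=trace)

def synthesize(rng: random.Random, chain_len: int = 3) -> Task:
    """Sample a tool chain (structural variety), execute it, derive a task."""
    names = [rng.choice(sorted(TOOLS)) for _ in range(chain_len)]
    return derive_task(execute_chain(names, rng.randint(1, 20)))
```

For example, `derive_task(execute_chain(["double", "square", "digit_sum"], 7))` yields a task whose answer is 16, because the recorded trace is 7 → 14 → 196 → 16. Sampling different chains with `synthesize` varies the task structure while keeping each answer tied to an execution that actually occurred.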