MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
This paper introduces MMTU, a large-scale benchmark comprising over 28,000 questions across 25 real-world expert-level table tasks, designed to comprehensively evaluate and reveal the significant limitations of current frontier models in understanding, reasoning, and manipulating structured tabular data.