On the Statistical Optimality of Optimal Decision Trees
This paper develops a statistical theory for globally optimal decision trees obtained by empirical risk minimization. It derives sharp oracle inequalities and minimax-optimal rates over a novel piecewise sparse heterogeneous anisotropic Besov space, yielding rigorous performance guarantees for high-dimensional regression and classification under both sub-Gaussian and heavy-tailed noise.