On-Average Stability of Multipass Preconditioned SGD and Effective Dimension
This paper develops a new on-average stability analysis for multipass preconditioned SGD and uses it to derive generalization bounds that depend on the effective dimension. The analysis shows how a mismatch between the curvature of the population risk and the geometry of the gradient noise can lead to suboptimal performance when the preconditioner is poorly chosen.
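To make the two central objects concrete, the following is a minimal sketch, not the paper's algorithm or definitions. It assumes a squared loss, a fixed diagonal preconditioner, and the ridge-type effective dimension $\sum_i \lambda_i / (\lambda_i + \lambda)$, which is one common convention and may differ from the quantity used in the paper. The function names `effective_dimension` and `preconditioned_sgd` and all parameters are hypothetical, chosen only for illustration.

```python
import random


def effective_dimension(eigenvalues, reg):
    """Ridge-type effective dimension: sum_i lam_i / (lam_i + reg).

    Assumption: this is one standard definition, used here for illustration.
    """
    return sum(lam / (lam + reg) for lam in eigenvalues)


def preconditioned_sgd(data, precond_diag, step_size, n_passes, seed=0):
    """Multipass SGD on the squared loss with a fixed diagonal preconditioner.

    data: list of (x, y) pairs, where x is a list of floats and y is a float.
    Update rule: w <- w - step_size * P * grad, with grad = (w.x - y) * x
    and P = diag(precond_diag). Each pass visits every sample once in a
    freshly shuffled order.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim
    order = list(range(len(data)))
    for _ in range(n_passes):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            residual = sum(wj * xj for wj, xj in zip(w, x)) - y
            w = [
                wj - step_size * pj * residual * xj
                for wj, pj, xj in zip(w, precond_diag, x)
            ]
    return w


# Toy usage: two noiseless samples whose features have very different scales.
toy_data = [([1.0, 0.0], 2.0), ([0.0, 0.1], -0.1)]
# Identity preconditioner: the low-curvature coordinate converges slowly.
w_plain = preconditioned_sgd(toy_data, [1.0, 1.0], step_size=0.5, n_passes=50)
# Preconditioner matched to the inverse squared feature scale (1 and 1/0.01).
w_precond = preconditioned_sgd(toy_data, [1.0, 100.0], step_size=0.5, n_passes=50)
```

In this noiseless toy problem the matched preconditioner only speeds up convergence on the poorly scaled coordinate. The mismatch effect described in the abstract concerns noisy gradients, which this sketch does not model.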