Accelerating High-Order Finite Element Simulations at Extreme Scale with FP64 Tensor Cores

This paper presents the first direct programming of FP64 tensor cores on NVIDIA GPUs to accelerate high-order finite element simulations within the MFEM library. It reports gains of up to 2× in performance and 83% in energy efficiency, and demonstrates near-perfect weak scaling across nearly 10,000 GPUs on the Alps exascale system.

Jiqun Tu, Ian Karlin, John Camier, Veselin Dobrev, Tzanio Kolev, Stefan Henneking, Omar Ghattas
Wed, 11 Ma · cs