On the Value of Tokeniser Pretraining in Physics Foundation Models
This paper demonstrates that pretraining tokenisers with an autoencoding objective before training dynamics models substantially improves both the computational efficiency and the accuracy of physics foundation models, particularly when the pretraining data aligns with the downstream physical system.
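A minimal sketch of the two-stage recipe described above, assuming a PyTorch-style setup: a tokeniser is first pretrained as an autoencoder on field snapshots, then frozen while a dynamics model is fit in its latent space. The module names, MLP architectures, dimensions, and training loop details are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class Tokeniser(nn.Module):
    """Hypothetical encoder-decoder mapping physical fields to latent tokens and back."""
    def __init__(self, field_dim=64, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(field_dim, 128), nn.GELU(), nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.GELU(), nn.Linear(128, field_dim))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

class DynamicsModel(nn.Module):
    """Stand-in dynamics model: predicts the next latent token from the current one."""
    def __init__(self, latent_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 128), nn.GELU(), nn.Linear(128, latent_dim))

    def forward(self, z):
        return self.net(z)

def pretrain_tokeniser(tokeniser, fields, epochs=10, lr=1e-3):
    """Stage 1: autoencoding objective -- reconstruct the input fields."""
    opt = torch.optim.Adam(tokeniser.parameters(), lr=lr)
    for _ in range(epochs):
        recon, _ = tokeniser(fields)
        loss = nn.functional.mse_loss(recon, fields)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return tokeniser

def train_dynamics(tokeniser, dynamics, fields_t, fields_t1, epochs=10, lr=1e-3):
    """Stage 2: freeze the pretrained tokeniser and fit the dynamics model in latent space."""
    for p in tokeniser.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(dynamics.parameters(), lr=lr)
    for _ in range(epochs):
        with torch.no_grad():
            z_t = tokeniser.encoder(fields_t)    # tokens at time t
            z_t1 = tokeniser.encoder(fields_t1)  # tokens at time t+1 (targets)
        loss = nn.functional.mse_loss(dynamics(z_t), z_t1)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return dynamics

if __name__ == "__main__":
    # Synthetic placeholder data: 256 snapshots of a 64-dimensional field and their successors.
    fields_t = torch.randn(256, 64)
    fields_t1 = fields_t + 0.1 * torch.randn(256, 64)

    tok = pretrain_tokeniser(Tokeniser(), fields_t)                   # stage 1: autoencoder pretraining
    dyn = train_dynamics(tok, DynamicsModel(), fields_t, fields_t1)   # stage 2: dynamics on frozen tokens
```

The paper's central comparison is between this staged procedure and training the tokeniser and dynamics model jointly from scratch; in this sketch that baseline would correspond to optimising both modules together on the next-step loss without the stage-1 reconstruction phase.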