Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation
This paper introduces Mesh-Pro, an asynchronous online reinforcement learning framework for artist-style quadrilateral mesh generation. Mesh-Pro combines Advantage-guided Ranking Preference Optimization (ARPO) with novel mesh tokenization techniques, significantly improving training efficiency and achieving state-of-the-art performance.