Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
The paper introduces Green-VLA, a five-stage curriculum framework that combines large-scale multimodal pretraining, embodiment-specific adaptation, and reinforcement learning to enable a single generalist policy to robustly control diverse robotic systems, including the Green humanoid, with enhanced safety and long-horizon efficiency.
I. Apanasevich, M. Artemyev, R. Babakyan, P. Fedotova, D. Grankin, E. Kupryashin, A. Misailidi, D. Nerus, A. Nutalapati, G. Sidorov, I. Efremov, M. Gerasyov, D. Pikurov, Y. Senchenko, S. Davidenko, D. Kulikov, M. Sultankin, K. Askarbek, O. Shamanin, D. Statovoy, E. Zalyaev, I. Zorin, A. Letkin, E. Rusakov, A. Silchenko, V. Vorobyov, S. Sobolnikov, A. Postnikov
2026-03-10 · cs