JanusVLN: Decoupling Semantics and Spatiality with Dual Implicit Memory for Vision-Language Navigation
JanusVLN is a Vision-Language Navigation (VLN) framework that addresses the limitations of explicit semantic memory by introducing a dual implicit neural memory, which decouples spatial-geometric representations from visual-semantic ones. Because this memory is compact and fixed in size, the modeling stays efficient while achieving state-of-the-art performance.
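
To make the idea of a fixed-size, decoupled memory concrete, the following is a minimal illustrative sketch, not the JanusVLN implementation. It keeps two separate bounded buffers, one for spatial-geometric features and one for visual-semantic features, so that memory does not grow with the length of the trajectory. All names here (`FixedSizeMemory`, `DualMemory`, `make_dual_memory`) are hypothetical, features are stood in for by plain lists of floats, and the drop-oldest eviction policy is an assumption made only for illustration; the summary above states only that the memory is fixed-size.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple

# Placeholder for an encoder output; a real system would use tensors.
Feature = List[float]


@dataclass
class FixedSizeMemory:
    """Bounded buffer holding at most `capacity` features.

    Assumption: the oldest entry is evicted when the buffer is full.
    """
    capacity: int
    entries: Deque[Feature] = field(init=False)

    def __post_init__(self) -> None:
        # deque(maxlen=...) discards the oldest item on overflow.
        self.entries = deque(maxlen=self.capacity)

    def write(self, feature: Feature) -> None:
        self.entries.append(feature)

    def read(self) -> List[Feature]:
        return list(self.entries)


@dataclass
class DualMemory:
    """Two decoupled memories: one spatial-geometric, one visual-semantic."""
    spatial: FixedSizeMemory
    semantic: FixedSizeMemory

    def update(self, spatial_feat: Feature, semantic_feat: Feature) -> None:
        # Each stream is written independently of the other.
        self.spatial.write(spatial_feat)
        self.semantic.write(semantic_feat)

    def size(self) -> Tuple[int, int]:
        return len(self.spatial.entries), len(self.semantic.entries)


def make_dual_memory(capacity: int) -> DualMemory:
    """Build a dual memory with the same capacity for both streams."""
    return DualMemory(FixedSizeMemory(capacity), FixedSizeMemory(capacity))
```

With a capacity of 3, writing ten steps of features leaves exactly three entries in each stream, which is the property the term "fixed-size" refers to: the cost of reading memory is bounded regardless of how many frames the agent has observed.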