A Miniature Brain Transformer: Thalamic Gating, Hippocampal Lateralization, Amygdaloid Salience, and Prefrontal Working Memory in Attention-Coupled Latent Memory

This paper introduces a miniature brain transformer architecture and uses it to demonstrate a novel, falsifiable prediction: functional lateralization of hippocampal banks requires the synergistic interaction of a prefrontal working-memory buffer (acting as a symmetry-breaker) and inhibitory callosal coupling. This interaction triggers a sharp phase transition in memory performance, whereas a cerebellar fast-path merely accelerates convergence.

Hong Jeong · 2026-03-10 · cs

VINO: Video-driven Invariance for Non-contextual Objects via Structural Prior Guided De-contextualization

VINO is a self-supervised learning framework that overcomes the "co-occurrence trap" in dense video by using a teacher-student distillation approach with structural priors to force representations to focus on foreground objects rather than background context, achieving state-of-the-art unsupervised object discovery performance.

Seul-Ki Yeom, Marcel Simon, Eunbin Lee, Tae-Ho Kim · 2026-03-10 · cs

Reinforcement Learning for Vehicle-to-Grid Voltage Regulation: Single-Hub to Multi-Hub Coordination with Battery-Aware Constraints

This paper proposes a soft actor-critic-based reinforcement learning framework for vehicle-to-grid voltage regulation that effectively coordinates single and multi-hub charging systems while prioritizing battery health and fleet availability, demonstrating robust performance comparable to standard droop controllers under both nominal and aggressive overloading conditions.

Jingbo Wang, Roshni Anna Jacob, Harshal D. Kaushik, Jie Zhang · 2026-03-10 · cs

LEPA: Learning Geometric Equivariance in Satellite Remote Sensing Data with a Predictive Architecture

This paper introduces LEPA, a learned architecture that conditions on geometric augmentations to accurately predict transformed satellite image embeddings, effectively overcoming the limitations of standard interpolation in non-convex geospatial foundation model manifolds and significantly improving geometric adjustment performance.

Erik Scheurer, Rocco Sedona, Stefan Kesselheim, Gabriele Cavallaro · 2026-03-10 · cs

Seeing the Context: Rich Visual Context-Aware Speech Recognition via Multimodal Reasoning

This paper introduces VASR, a multimodal reasoning framework for Context-Aware Visual Speech Recognition (CAVSR) that leverages an Audio-Visual Chain-of-Thought (AV-CoT) to explicitly ground acoustic signals with rich visual context like scenes and on-screen text, thereby overcoming single-modality dominance and achieving state-of-the-art performance.

Wenjie Tian, Mingchen Shao, Bingshen Mu, Xuelong Geng, Chengyou Wang, Yujie Liao, Zhixian Zhao, Ziyu Zhang, Jingbin Hu, Mengqi Wei, Lei Xie · 2026-03-10 · cs