HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

This paper introduces Poly2Graph, an automated pipeline for generating HSG-12M, a dataset of roughly 12 million spatial multigraphs derived from the energy spectra of non-Hermitian crystals, which bridges condensed matter physics and geometry-aware graph learning by preserving geometric information that existing benchmarks often discard.

Xianquan Yan, Hakan Akgün, Kenji Kawaguchi + 2 more · 2026-03-06 · 🔬 cond-mat.mes-hall

Structured Kolmogorov-Arnold Neural ODEs for Interpretable Learning and Symbolic Discovery of Nonlinear Dynamics

This paper introduces Structured Kolmogorov-Arnold Neural ODEs (SKANODEs), a framework that combines structured state-space modeling with Kolmogorov-Arnold Networks to accurately recover interpretable physical latent states and discover compact symbolic governing equations for nonlinear dynamical systems, outperforming black-box neural ODEs and classical identification methods across synthetic and real-world datasets.

Wei Liu, Kiran Bacsa, Loon Ching Tang + 1 more · 2026-03-06 · 🔬 physics
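The distinguishing ingredient in SKANODEs is the Kolmogorov-Arnold layer, in which each edge carries a learnable univariate function and each output is a sum of those functions rather than a weighted inner product. A minimal sketch of that structure, using a small polynomial basis as a stand-in for the B-splines real KANs use (all names and coefficient values here are illustrative, not from the paper):

```python
import numpy as np

def kan_edge(x, coeffs):
    """One learnable univariate function phi(x), expanded in a
    polynomial basis (real KANs use B-splines; this is a stand-in)."""
    basis = np.array([x ** k for k in range(len(coeffs))])
    return coeffs @ basis

def kan_layer(x, coeff_table):
    """KAN layer: each output is a SUM of univariate functions of the
    inputs, y_i = sum_j phi_ij(x_j) -- no weight-vector inner products."""
    out = np.zeros(len(coeff_table))
    for i, row in enumerate(coeff_table):
        out[i] = sum(kan_edge(x[j], c) for j, c in enumerate(row))
    return out

# 2 inputs -> 1 output, with edges phi(x) = x^2 and phi(x) = -x
coeffs = [[np.array([0.0, 0.0, 1.0]), np.array([0.0, -1.0])]]
print(kan_layer(np.array([2.0, 3.0]), coeffs))  # 2^2 - 3 = [1.0]
```

Because every nonlinearity lives on a single edge, a trained edge function can be read off and fit with a symbolic expression, which is what makes the symbolic-discovery step in the paper plausible.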

Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective

This paper demonstrates that Reinforcement Fine-Tuning (RFT) outperforms Supervised Fine-Tuning (SFT) in preserving prior knowledge for multimodal large language models by leveraging training data with smaller influence magnitudes and better alignment to the base model's probability landscape, thereby mitigating catastrophic forgetting while enabling effective task adaptation.

Zhihao Zhang, Qiaole Dong, Qi Zhang + 12 more · 2026-03-06 · 💻 cs

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining

MuRating is a scalable framework that transfers high-quality English data-quality signals to a unified multilingual evaluator via pairwise comparisons and translation, enabling the selection of balanced, high-quality datasets that significantly improve the performance of multilingual large language models on both English and non-English benchmarks.

Zhixun Chen, Ping Guo, Wenhan Han + 10 more · 2026-03-06 · 💻 cs
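MuRating's quality signal is built from pairwise comparisons between documents. A standard way to turn pairwise preferences into scalar quality scores is a Bradley-Terry model; the sketch below is a generic MM-style fit illustrating that idea, not the paper's actual aggregation scheme:

```python
import numpy as np

def bradley_terry(n_items, comparisons, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) index pairs
    via the classic minorize-maximize update. Hypothetical helper --
    MuRating's real evaluator may aggregate comparisons differently."""
    wins = np.zeros((n_items, n_items))
    for w, l in comparisons:
        wins[w, l] += 1
    p = np.ones(n_items)
    for _ in range(iters):
        for i in range(n_items):
            num = wins[i].sum()  # total wins of item i
            den = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                      for j in range(n_items) if j != i)
            if den > 0:
                p[i] = num / den
        p /= p.sum()  # fix the scale (BT strengths are scale-free)
    return p

# doc 0 beats doc 1 twice and doc 2 once; doc 1 beats doc 2 once
scores = bradley_terry(3, [(0, 1), (0, 1), (0, 2), (1, 2)])
print(scores.argsort()[::-1])  # quality ranking, best first
```

Once each document has a scalar score, selecting a "balanced, high-quality" corpus reduces to thresholding or sampling by score within each language.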

Design and Experimental Validation of Sensorless 4-Channel Bilateral Teleoperation for Low-Cost Manipulators

This paper presents a sensorless 4-channel bilateral teleoperation framework that enables stable, high-speed force feedback control on low-cost manipulators through disturbance-observer-based estimation and simplified tuning, ultimately demonstrating that such force-enhanced data significantly improves imitation learning performance.

Koki Yamane, Yunhan Li, Masashi Konosu + 4 more · 2026-03-06 · 💻 cs
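The teleoperation above is "sensorless" because external force is estimated rather than measured with a force/torque sensor. A minimal sketch of a disturbance-observer-style estimator for a single joint, assuming a rigid-body nominal model with illustrative inertia and cutoff values (not the paper's):

```python
class DisturbanceObserver:
    """First-order disturbance observer for external-torque estimation.

    Estimates tau_ext as the residual between commanded torque and the
    nominal model J * qddot, passed through a low-pass filter with
    cutoff g [rad/s]. Inertia and cutoff here are illustrative only.
    """
    def __init__(self, inertia, cutoff, dt):
        self.J, self.g, self.dt = inertia, cutoff, dt
        self.tau_hat = 0.0

    def update(self, tau_cmd, qddot):
        raw = tau_cmd - self.J * qddot  # model-based residual
        a = self.g * self.dt / (1.0 + self.g * self.dt)  # discrete LPF gain
        self.tau_hat += a * (raw - self.tau_hat)
        return self.tau_hat

dob = DisturbanceObserver(inertia=0.1, cutoff=50.0, dt=0.001)
# constant 0.5 Nm external load: qddot = (tau_cmd - tau_ext) / J
for _ in range(2000):
    est = dob.update(tau_cmd=1.0, qddot=(1.0 - 0.5) / 0.1)
print(est)  # converges toward the 0.5 Nm load
```

The low-pass filter is what makes this usable on low-cost hardware: it trades estimation bandwidth for robustness to the encoder-differentiation noise in qddot.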

LHM-Humanoid: Learning a Unified Policy for Long-Horizon Humanoid Whole-Body Loco-Manipulation in Diverse Messy Environments

The paper introduces LHM-Humanoid, a unified learning framework and benchmark that employs reinforcement learning and policy distillation to enable humanoid agents to perform robust, long-horizon loco-manipulation tasks across diverse, cluttered environments without relying on pre-trained skill libraries or environment resets.

Haozhuo Zhang, Jingkai Sun, Michele Caprio + 4 more · 2026-03-06 · 💻 cs

Diffusion-Based Impedance Learning for Contact-Rich Manipulation Tasks

This paper introduces Diffusion-Based Impedance Learning, a framework that combines a Transformer-based diffusion model with energy-consistent impedance control to enable robots to learn and adapt contact-rich manipulation behaviors from teleoperated demonstrations, achieving high-precision performance and robust generalization in tasks like peg-in-hole insertion.

Noah Geiger, Tamim Asfour, Neville Hogan + 1 more · 2026-03-06 · 💻 cs

Complexity-Regularized Proximal Policy Optimization

This paper introduces Complexity-Regularized Proximal Policy Optimization (CR-PPO), an algorithm that replaces standard entropy regularization with a self-regulating complexity term, defined as the product of Shannon entropy and disequilibrium, which maintains beneficial stochasticity while reducing sensitivity to hyperparameter tuning and avoiding overriding the reward signal.

Luca Serfilippi, Giorgio Franceschelli, Antonio Corradi + 1 more · 2026-03-06 · 💻 cs
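The complexity term in CR-PPO is the product of Shannon entropy H and disequilibrium D. Taking D as the squared distance of the action distribution from the uniform distribution, an LMC-style choice assumed here rather than taken from the paper, a minimal sketch:

```python
import numpy as np

def lmc_complexity(probs, eps=1e-12):
    """Complexity C = H * D: Shannon entropy times disequilibrium.

    Disequilibrium D is taken as the squared Euclidean distance of the
    action distribution from uniform (an assumption; the paper may
    define or normalize it differently).
    """
    probs = np.asarray(probs, dtype=float)
    n = probs.size
    entropy = -np.sum(probs * np.log(probs + eps))   # Shannon entropy H
    disequilibrium = np.sum((probs - 1.0 / n) ** 2)  # distance from uniform
    return entropy * disequilibrium

uniform = np.full(4, 0.25)
peaked = np.array([0.97, 0.01, 0.01, 0.01])
mixed = np.array([0.6, 0.2, 0.1, 0.1])
print(lmc_complexity(uniform))  # 0.0: D vanishes at uniform
print(lmc_complexity(peaked), lmc_complexity(mixed))
```

Because D vanishes for a uniform policy and H vanishes for a deterministic one, the regularizer is self-limiting at both extremes, which is the mechanism that lets it avoid overriding the reward signal in a way a plain entropy bonus cannot.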