Morphological-Symmetry-Equivariant Heterogeneous Graph Neural Network for Robotic Dynamics Learning

This paper introduces MS-HGNN, a morphological-symmetry-equivariant heterogeneous graph neural network that integrates robotic kinematic structures and symmetries as architectural constraints to achieve high generalizability and efficiency in learning dynamics for various multi-body systems, with its effectiveness validated through formal proofs and experiments on quadruped robots.
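The morphological-symmetry equivariance constraint can be illustrated with a generic group-averaging (symmetrization) construction. This is not MS-HGNN itself, only a minimal sketch of what equivariance means: a reflection `P` that swaps a robot's left and right legs should commute with the learned map. `P`, the base network `g`, and the 2-D state are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Reflection symmetry: swap the two "legs" of a toy 2-D state.
P = np.array([[0, 1],
              [1, 0]])

W = rng.standard_normal((2, 2))
g = lambda x: np.tanh(W @ x)                 # arbitrary (non-equivariant) base network

# Group averaging over {I, P} yields an equivariant map: f(Px) = P f(x).
f = lambda x: 0.5 * (g(x) + P @ g(P @ x))

x = rng.standard_normal(2)
print(np.allclose(f(P @ x), P @ f(x)))       # True: f commutes with the symmetry
```

The identity holds because `P` is its own inverse, so `f(Px) = (g(Px) + P g(x)) / 2 = P f(x)`; MS-HGNN instead builds the constraint into the network architecture rather than averaging after the fact.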

Fengze Xie, Sizhe Wei, Yue Song, Yisong Yue, Lu Gan · Wed, 11 Ma · cs.LG

Prognostics for Autonomous Deep-Space Habitat Health Management under Multiple Unknown Failure Modes

This paper proposes an unsupervised prognostics framework that utilizes unlabeled run-to-failure data to simultaneously identify latent failure modes and select informative sensors, thereby enabling accurate remaining useful life prediction for autonomous deep-space habitats under multiple unknown failure conditions.

Benjamin Peters, Ayush Mohanty, Xiaolei Fang, Stephen K. Robinson, Nagi Gebraeel · Wed, 11 Ma · cs.LG

Adaptive and Stratified Subsampling for High-Dimensional Robust Estimation

This paper introduces Adaptive Importance Sampling and Stratified Subsampling estimators that achieve minimax-optimal rates for robust high-dimensional sparse regression under heavy-tailed noise, contamination, and temporal dependence, while also providing fully specified de-biasing procedures for valid confidence intervals and demonstrating superior empirical performance over uniform subsampling.
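As a hedged illustration of the general stratified-subsampling idea (not the paper's estimator, rates, or de-biasing procedure), the sketch below estimates a mean under heavy-tailed t-distributed noise by drawing an equal-size subsample from each stratum and reweighting by stratum size; all names and sizes are made up.

```python
import numpy as np

def stratified_subsample_mean(x, strata, m_per_stratum, rng):
    """Estimate the population mean of x by subsampling each stratum
    and reweighting each stratum's sample mean by its population share."""
    n = len(x)
    est = 0.0
    for s in np.unique(strata):
        idx = np.where(strata == s)[0]
        take = rng.choice(idx, size=min(m_per_stratum, len(idx)), replace=False)
        est += (len(idx) / n) * x[take].mean()
    return est

rng = np.random.default_rng(0)
# Two strata with different means; heavy-tailed Student-t noise (df=3).
x = np.concatenate([1.0 + rng.standard_t(df=3, size=5000),
                    5.0 + rng.standard_t(df=3, size=5000)])
strata = np.concatenate([np.zeros(5000, int), np.ones(5000, int)])

est = stratified_subsample_mean(x, strata, m_per_stratum=200, rng=rng)
print(round(est, 2))  # close to the true mean 3.0
```

Stratification guarantees both subpopulations are represented in the subsample, which uniform subsampling does not; the paper's contribution is making this kind of estimator minimax-optimal under contamination and temporal dependence.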

Prateek Mittal, Joohi Chauhan · Wed, 11 Ma · cs.LG

Enhancing Computational Efficiency in Multiscale Systems Using Deep Learning of Coordinates and Flow Maps

This paper proposes a deep learning framework that jointly discovers optimal coordinates and flow maps to enable precise, computationally efficient time-stepping for multiscale systems, achieving state-of-the-art predictive accuracy with reduced costs on complex models like the FitzHugh-Nagumo neuron and Kuramoto-Sivashinsky equations.
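A minimal sketch of the coordinates-plus-flow-map idea, using a linear toy system where the "learned" coordinates are simply the eigenbasis that decouples the dynamics; the paper learns both the coordinates and the flow map with deep networks, so everything here is illustrative.

```python
import numpy as np

# Toy linear system x_{k+1} = A x_k; in eigen-coordinates the flow map
# becomes a trivial (decoupled, elementwise) update.
A = np.array([[0.9, 0.5],
              [0.0, 0.5]])
evals, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)

encode = lambda x: Vinv @ x       # "learned" coordinates (here: eigenbasis)
decode = lambda z: V @ z
flow   = lambda z: evals * z      # decoupled one-step flow map

x = np.array([1.0, -1.0])
x_next = decode(flow(encode(x)))  # identical to stepping with A directly
print(np.allclose(x_next, A @ x))  # True
```

The payoff of good coordinates is exactly this decoupling: the expensive coupled update is replaced by cheap independent updates, which is what the learned flow maps exploit for nonlinear multiscale systems.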

Asif Hamid, Danish Rafiq, Shahkar Ahmad Nahvi, Mohammad Abid Bazaz · Wed, 11 Ma · cs.LG

SA²GFM: Enhancing Robust Graph Foundation Models with Structure-Aware Semantic Augmentation

This paper introduces SA²GFM, a robust Graph Foundation Model framework that enhances domain-adaptive representations and generalization by integrating structure-aware semantic augmentation, an information bottleneck mechanism, and expert adaptive routing to effectively mitigate domain noise, structural perturbations, and adversarial attacks.

Junhua Shi, Qingyun Sun, Haonan Yuan, Xingcheng Fu · Wed, 11 Ma · cs.LG

Bradley-Terry Policy Optimization for Generative Preference Modeling

This paper introduces Bradley-Terry Policy Optimization (BTPO), a novel framework that derives a consistent Monte Carlo gradient estimator to effectively train large language models with chain-of-thought reasoning on non-verifiable pairwise preference tasks, overcoming the limitations of existing heuristic RL approaches.
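For context, the Bradley-Terry model that BTPO builds on assigns preference probability P(a ≻ b) = σ(r_a − r_b) from scalar scores. The sketch below computes that probability and the pairwise negative log-likelihood; it does not show the paper's Monte Carlo gradient estimator, and the rewards and pairs are toy values.

```python
import math

def bt_prob(r_a, r_b):
    """Bradley-Terry probability that item a is preferred over item b."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

def bt_nll(pairs, rewards):
    """Negative log-likelihood of observed (winner, loser) preferences."""
    return -sum(math.log(bt_prob(rewards[w], rewards[l])) for w, l in pairs)

rewards = {"A": 1.2, "B": 0.3, "C": -0.5}
pairs = [("A", "B"), ("A", "C"), ("B", "C")]   # observed winner, loser

print(round(bt_prob(rewards["A"], rewards["B"]), 3))  # 0.711
print(round(bt_nll(pairs, rewards), 3))               # 0.88
```

In the LLM setting the scores `r` are produced by the policy itself on non-verifiable tasks, which is why a consistent gradient estimator for this objective is the crux of the paper.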

Shengyu Feng, Yun He, Shuang Ma, Beibin Li, Yuanhao Xiong, Songlin Li, Karishma Mandyam, Julian Katz-Samuels, Shengjie Bi, Licheng Yu, Hejia Zhang, Karthik Abinav Sankararaman, Han Fang, Yiming Yang, Manaal Faruqui · Wed, 11 Ma · cs.LG

Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking

This paper proposes a hybrid control framework that combines Deep Reinforcement Learning (DRL) with robust model-independent bounded extremum seeking to enhance the stability and adaptability of controlling nonlinear time-varying systems, demonstrating its effectiveness through numerical simulations and the automatic tuning of a particle accelerator.
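A minimal sketch of classical discrete-time extremum seeking (not the bounded variant the paper combines with DRL): the controller perturbs its input with a sinusoidal dither, demodulates raw cost measurements to form a gradient estimate, and descends on it without any model. All gains and the quadratic cost are illustrative.

```python
import math

def extremum_seek(cost, theta0, a=0.2, gamma=0.02, omega=0.5, steps=4000):
    """Model-free minimization using only cost measurements:
    dither the input, then demodulate the cost to estimate the gradient."""
    theta_hat = theta0
    for k in range(steps):
        dither = math.cos(omega * k)
        J = cost(theta_hat + a * dither)   # measured cost at the dithered input
        theta_hat -= gamma * J * dither    # demodulated gradient-descent step
    return theta_hat

# Cost unknown to the controller, with its minimum at theta = 2.
theta_star = extremum_seek(lambda t: (t - 2.0) ** 2, theta0=0.0)
print(round(theta_star, 2))  # converges near the minimizer 2.0
```

Averaged over the dither period, the update `gamma * J * dither` is proportional to the true gradient, which is what makes the scheme model-independent; the bounded variant additionally guarantees bounded update rates, which the paper uses to stabilize the DRL policy on time-varying systems.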

Shaifalee Saxena, Alan Williams, Rafael Fierro, Alexander Scheinker · Wed, 11 Ma · cs.LG

ZeroSiam: An Efficient Asymmetry for Test-Time Entropy Optimization without Collapse

This paper introduces ZeroSiam, an efficient asymmetric Siamese architecture that prevents model collapse during test-time entropy minimization by employing asymmetric divergence alignment, thereby enhancing adaptation and reasoning performance across diverse vision and language tasks with negligible overhead.
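The entropy objective that test-time adaptation methods of this kind minimize can be sketched as follows. This shows only generic softmax-entropy descent (the collapse-prone baseline), not ZeroSiam's asymmetric Siamese design; the logits are toy values.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(p):
    """Shannon entropy (nats) of a predictive distribution."""
    return float(-(p * np.log(p + 1e-12)).sum())

logits = np.array([1.0, 0.8, 0.1])
p = softmax(logits)
h0 = entropy(p)

# Analytic gradient of the entropy w.r.t. the logits of a softmax head:
# dH/dz_j = -p_j * (log p_j + H).
grad = -p * (np.log(p + 1e-12) + h0)
h1 = entropy(softmax(logits - 0.5 * grad))   # one entropy-descent step
print(h1 < h0)                               # True: the prediction sharpens
```

Repeated sharpening of this kind is exactly what can drive a model to collapse onto a single class at test time; ZeroSiam's contribution is an asymmetric architecture that keeps the benefit of the objective while preventing that failure mode.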

Guohao Chen, Shuaicheng Niu, Deyu Chen, Jiahao Yang, Zitian Zhang, Mingkui Tan, Pengcheng Wu, Zhiqi Shen · Wed, 11 Ma · cs.LG

A Surrogate Model for High Temperature Superconducting Magnets to Predict Current Distribution with Neural Network

This paper presents a fully connected residual neural network (FCRN) surrogate model trained on finite element method data to rapidly and accurately predict current density distributions and optimize the design of large-scale high-temperature superconducting magnets, overcoming the computational limitations of traditional simulations.
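A generic fully connected residual network forward pass can sketch the FCRN structure; this is an untrained toy with made-up dimensions, whereas the paper's model is trained on finite-element data to map magnet design inputs to current-density distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W1, b1, W2, b2):
    """One fully connected residual block: x + MLP(x), easing deep training."""
    h = np.tanh(x @ W1 + b1)
    return x + h @ W2 + b2

dim = 16
x = rng.standard_normal((4, dim))   # e.g. a batch of 4 magnet operating points
blocks = [(rng.standard_normal((dim, dim)) * 0.1, np.zeros(dim),
           rng.standard_normal((dim, dim)) * 0.1, np.zeros(dim))
          for _ in range(3)]

for W1, b1, W2, b2 in blocks:
    x = residual_block(x, W1, b1, W2, b2)

head = rng.standard_normal((dim, 8)) * 0.1
y = x @ head                        # predicted current-density values per point
print(y.shape)                      # (4, 8)
```

Once trained, a forward pass like this costs microseconds, which is the source of the speedup over repeated finite-element simulation during design optimization.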

Mianjun Xiao, Peng Song, Yulong Liu, Cedric Korte, Ziyang Xu, Jiale Gao, Jiaqi Lu, Haoyang Nie, Qiantong Deng, Timing Qu · Wed, 11 Ma · cs.LG

Iterative In-Context Learning to Enhance LLMs' Abstract Reasoning: The Case-Study of Algebraic Tasks

This paper proposes an iterative in-context learning methodology that optimizes few-shot example selection to significantly enhance large language models' systematic generalization and reasoning capabilities on algebraic tasks with non-standard rules, revealing that simpler examples can sometimes outperform complex ones.
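A hedged sketch of iterative (greedy) few-shot example selection, with a toy scorer standing in for the LLM-based evaluation such a method would use; the pool contents and the scorer are invented for illustration.

```python
def select_examples(pool, k, score):
    """Greedy iterative few-shot selection: at each round, add the pool
    example that most improves a scoring function of the prompt set."""
    chosen = []
    for _ in range(k):
        best = max((e for e in pool if e not in chosen),
                   key=lambda e: score(chosen + [e]))
        chosen.append(best)
    return chosen

# Toy scorer preferring short examples, echoing the paper's observation
# that simpler examples can outperform complex ones; a real scorer would
# query the LLM on held-out tasks.
pool = ["a+b", "(a+b)*(c-d)", "a*b+c", "((a+b)+c)+d"]
picked = select_examples(pool, k=2, score=lambda es: -sum(len(e) for e in es))
print(picked)  # ['a+b', 'a*b+c']
```

Iterating this selection-and-evaluation loop, rather than fixing the few-shot set once, is the core of the methodology summarized above.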

Stefano Fioravanti, Matteo Zavatteri, Roberto Confalonieri, Kamyar Zeinalipour, Paolo Frazzetto, Alessandro Sperduti, Nicolò Navarin · Wed, 11 Ma · cs.LG