cs.LG papers | Gist.Science

Active Advantage-Aligned Online Reinforcement Learning with Offline Data

This paper introduces A3RL, a novel framework that integrates offline and online reinforcement learning through a confidence-aware active advantage-aligned sampling strategy to dynamically prioritize high-value data, thereby overcoming challenges like catastrophic forgetting and improving sample efficiency to outperform existing methods.

Xuefeng Liu, Hung T. C. Le, Siyu Chen, Rick Stevens, Zhuoran Yang, Matthew R. Walter, Yuxin Chen | 2026-03-10 | 🤖 cs.LG

Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative

This paper introduces Texts as Time Series (TaTS), a novel framework that leverages the periodic alignment between paired texts and time series data to enhance multimodal forecasting and imputation performance in existing numerical-only models without requiring architectural changes.

Zihao Li, Xiao Lin, Zhining Liu, Jiaru Zou, Ziwei Wu, Lecheng Zheng, Dongqi Fu, Yada Zhu, Hendrik Hamann, Hanghang Tong, Jingrui He | 2026-03-10 | 🤖 cs.LG

LaVCa: LLM-assisted Visual Cortex Captioning

The paper proposes LaVCa, a novel data-driven approach that leverages large language models to generate natural-language captions for images, thereby providing more accurate and detailed interpretations of human visual cortex voxel selectivity and revealing fine-grained functional differentiation within the visual cortex compared to existing deep neural network-based methods.

Takuya Matsuyama, Shinji Nishimoto, Yu Takagi | 2026-03-10 | 🤖 cs.LG

Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective

This paper proposes a Clustering-On-Difficulty (COD) framework that groups tasks by their difficulty scaling features to predict downstream LLM performance with high accuracy (1.55% error), effectively addressing challenges like emergent capabilities and inconsistent scaling patterns.

Chengyin Xu, Kaiyuan Chen, Xiao Li, Ke Shen, Chenggang Li | 2026-03-10 | 🤖 cs.LG

Subclass Classification of Gliomas Using MRI Fusion Technique

This study proposes a glioma subclass classification framework that fuses multimodal MRI images segmented by 2D and 3D U-Net models using weighted averaging and classifies them with a pre-trained ResNet50 model, reporting 99.25% accuracy.

Kiranmayee Janardhan, Christy Bobby Thomas | 2026-03-10 | 💻 cs

A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning

This paper proposes Leave-One-Out PPO (LOOP), a novel reinforcement learning method that combines variance reduction techniques from REINFORCE with the robustness of PPO to achieve a superior balance between sample efficiency and performance in fine-tuning text-to-image diffusion models.

Shashank Gupta, Chaitanya Ahuja, Tsung-Yu Lin + 4 more | 2026-03-10 | 🤖 cs.AI
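
As a rough illustration of the two ingredients named in the LOOP summary above, the sketch below pairs a leave-one-out baseline (the REINFORCE-style variance reduction the method's name suggests) with a PPO-style clipped surrogate. It is not the authors' implementation: the function names, the clipping range `eps=0.2`, and the assumption that several samples are drawn per prompt are all hypothetical choices made for this example.

```python
import numpy as np

def leave_one_out_advantages(rewards):
    """Advantage of each sample relative to the mean reward of the other samples.

    rewards: rewards of k >= 2 samples generated for the same prompt (assumed setup).
    """
    rewards = np.asarray(rewards, dtype=float)
    k = rewards.shape[0]
    # Baseline for sample i is the average reward over all samples except i.
    baseline = (rewards.sum() - rewards) / (k - 1)
    return rewards - baseline

def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO-style clipped objective, evaluated elementwise (to be maximized)."""
    ratio = np.asarray(ratio, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return np.minimum(ratio * advantage, clipped * advantage)

adv = leave_one_out_advantages([1.0, 2.0, 3.0])
objective = clipped_surrogate([1.5, 1.0, 0.5], adv)
```

Because each sample's baseline excludes its own reward, the baseline does not depend on the sample it is applied to, which is the usual argument for why this kind of baseline reduces variance without biasing the gradient estimate.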

Go Beyond Your Means: Unlearning with Per-Sample Gradient Orthogonalization

This paper introduces OrthoGrad, a novel machine unlearning method that effectively removes the influence of specific data by projecting unlearn gradients onto the subspace orthogonal to retain gradients, thereby mitigating interference and outperforming existing approaches even when only a small portion of the training data is available.

Aviv Shamsian, Eitan Shaar, Aviv Navon, Gal Chechik, Ethan Fetaya | 2026-03-10 | 🤖 cs.LG
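
The projection described in the OrthoGrad summary above can be sketched in a few lines of NumPy. This is only a minimal illustration of projecting one vector onto the orthogonal complement of a set of others, not the paper's algorithm: the function name `project_orthogonal` is hypothetical, gradients are assumed to be flattened into vectors, and the retain gradients are assumed to be linearly independent and fewer than the parameter dimension.

```python
import numpy as np

def project_orthogonal(unlearn_grad, retain_grads):
    """Remove from unlearn_grad every component in the span of retain_grads.

    unlearn_grad: array of shape (d,), a flattened gradient.
    retain_grads: array of shape (k, d), one flattened per-sample gradient per row
                  (assumed linearly independent, with k <= d).
    """
    # Columns of q form an orthonormal basis for the span of the retain gradients.
    q, _ = np.linalg.qr(retain_grads.T)
    # Subtract the component of unlearn_grad that lies in that span.
    return unlearn_grad - q @ (q.T @ unlearn_grad)

rng = np.random.default_rng(0)
retain = rng.normal(size=(3, 8))   # three toy retain gradients in 8 dimensions
g = rng.normal(size=8)             # toy unlearn gradient
g_perp = project_orthogonal(g, retain)
```

To first order, a step along `g_perp` leaves the loss on each retain sample unchanged, since its inner product with every retain gradient is zero; that is the sense in which the projection limits interference.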

LLM-Powered Prediction of Hyperglycemia and Discovery of Behavioral Treatment Pathways from Wearables and Diet

This paper introduces GlucoLens, an explainable machine learning system that integrates wearable sensor data, diet, and activity logs to accurately predict postprandial hyperglycemia and recommend personalized behavioral interventions for managing blood glucose levels.

Abdullah Mamun, Asiful Arefeen, Susan B. Racette + 4 more | 2026-03-10 | 🤖 cs.AI

IMPACT: Intelligent Motion Planning with Acceptable Contact Trajectories via Vision-Language Models

The paper proposes IMPACT, a novel motion planning framework that leverages Vision-Language Models to infer environment semantics and generate anisotropic cost maps, enabling a contact-aware A* planner to safely navigate cluttered environments by distinguishing between acceptable and dangerous object contacts.

Yiyang Ling, Karan Owalekar, Oluwatobiloba Adesanya, Erdem Bıyık, Daniel Seita | 2026-03-10 | 🤖 cs.LG

Characterizing Nonlinear Dynamics via Smooth Prototype Equivalences

This paper introduces Smooth Prototype Equivalences (SPE), a framework utilizing invertible neural networks to match sparse, noisy empirical observations to prototypical dynamical behaviors, enabling the equation-free identification of invariant structures and classification of dynamical regimes in complex biological and physical systems.

Roy Friedman, Noa Moriel, Matthew Ricci, Guy Pelc, Yair Weiss, Mor Nitzan | 2026-03-10 | 🤖 cs.LG

MUSS: Multilevel Subset Selection for Relevance and Diversity

This paper introduces MUSS, a novel multilevel subset selection method that significantly improves both the scalability and performance of relevant and diverse selection tasks in applications like recommender systems and RAG, while providing a constant factor approximation guarantee and a tighter theoretical bound for existing distributed approaches.

Vu Nguyen, Andrey Kan | 2026-03-10 | 🤖 cs.LG

More Bang for the Buck: Process Reward Modeling with Entropy-Driven Uncertainty

The paper introduces EDU-PRM, an entropy-driven process reward model that automatically identifies reasoning step boundaries using predictive entropy to eliminate manual annotations, achieving state-of-the-art performance with only 1.5% of the training data while significantly improving accuracy and reducing token usage.

Lang Cao, Renhong Chen, Yingtian Zou, Chao Peng, Huacong Xu, Yuxian Wang, Wu Ning, Qian Chen, Mofan Peng, Zijie Chen, Peishuo Su, Yitong Li | 2026-03-10 | 🤖 cs.LG

Enhancing Metabolic Syndrome Prediction with Hybrid Data Balancing and Counterfactuals

This paper proposes MetaBoost, a novel hybrid framework combining SMOTE, ADASYN, and CTGAN to optimize data balancing for enhanced Metabolic Syndrome prediction, while utilizing counterfactual analysis to identify blood glucose and triglycerides as the most critical modifiable risk factors.

Sanyam Paresh Shah, Abdullah Mamun, Shovito Barua Soumma + 1 more | 2026-03-10 | 🤖 cs.AI

Estimating Item Difficulty Using Large Language Models and Tree-Based Machine Learning Algorithms

This study demonstrates that while Large Language Models can directly estimate item difficulty for K-5 assessments, a hybrid approach combining LLM-extracted cognitive and linguistic features with tree-based machine learning algorithms yields significantly higher predictive accuracy, offering a scalable alternative to resource-intensive field testing.

Pooya Razavi, Sonya Powers | 2026-03-10 | 🤖 cs.LG

A Champion-level Vision-based Reinforcement Learning Agent for Competitive Racing in Gran Turismo 7

This paper introduces a vision-based reinforcement learning agent that achieves champion-level performance in Gran Turismo 7 by utilizing an asymmetric actor-critic framework to rely solely on ego-centric camera views and onboard sensors, thereby eliminating the need for external global localization while outperforming the game's built-in drivers.

Hojoon Lee, Takuma Seno, Jun Jet Tai, Kaushik Subramanian, Kenta Kawamoto, Peter Stone, Peter R. Wurman | 2026-03-10 | 🤖 cs.LG

Structural Inference: Interpreting Small Language Models with Susceptibilities

This paper introduces a linear response framework that models neural networks as Bayesian statistical mechanical systems to efficiently compute susceptibility-based attribution scores, revealing a low-rank structure that isolates functional modules like multigram and induction heads in small transformers.

Garrett Baker, George Wang, Jesse Hoogland, Daniel Murfet | 2026-03-10 | 🤖 cs.LG

Learning to Rank Critical Road Segments via Heterogeneous Graphs with Origin-Destination Flow Integration

This paper proposes HetGL2R, a heterogeneous graph learning framework that integrates origin-destination flows, routes, and network topology via a tripartite graph and attribute-guided nodes to effectively rank critical road segments by capturing long-range spatial dependencies and functional similarities.

Ming Xu, Jinrong Xiang, Zilong Xie + 1 more | 2026-03-10 | 🤖 cs.LG

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

This paper presents a comprehensive review that consolidates fragmented evaluation efforts into a unified taxonomy of approximately 60 benchmarks, surveys AI-agent frameworks and collaboration protocols, and explores real-world applications and future research directions for autonomous AI agents.

Mohamed Amine Ferrag, Norbert Tihanyi, Merouane Debbah | 2026-03-10 | 🤖 cs.LG

StablePCA: Distributionally Robust Learning of Shared Representations from Multi-Source Data

This paper introduces StablePCA, a distributionally robust framework for extracting shared low-dimensional representations from multi-source data by maximizing worst-case explained variance, and addresses its inherent nonconvexity through a convex relaxation solved by an efficient Mirror-Prox algorithm with global convergence guarantees and a data-dependent certificate for solution tightness.

Zhenyu Wang, Molei Liu, Jing Lei, Francis Bach, Zijian Guo | 2026-03-10 | 🤖 cs.LG

Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data

This paper proposes a penalized pessimistic personalized policy learning (P4L) framework that leverages individual latent variables to derive optimal policies for heterogeneous populations from offline data, achieving fast regret rates under weak coverage assumptions and outperforming existing methods in both simulations and real-world applications.

Rui Miao, Babak Shahbaba, Annie Qu | 2026-03-10 | 🤖 cs.LG
