cs.LG papers | Gist.Science

Consensus is Not Verification: Why Crowd Wisdom Strategies Fail for LLM Truthfulness

The paper demonstrates that, unlike in domains with external verifiers, scaling inference compute through crowd wisdom strategies fails to improve LLM truthfulness in unverified settings: because model errors are correlated and social prediction cannot be distinguished from truth verification, aggregation reinforces shared misconceptions rather than identifying correct answers.

Yegor Denisov-Blanch, Joshua Kazdan, Jessica Chudnovsky, Rylan Schaeffer, Sheng Guan, Soji Adeshina, Sanmi Koyejo | 2026-03-10 | 🤖 cs.LG

OptiRoulette Optimizer: A New Stochastic Meta-Optimizer for up to 5.3x Faster Convergence

The paper introduces OptiRoulette, a stochastic meta-optimizer that dynamically selects update rules from a pool during training, demonstrating significantly faster convergence and higher test accuracy across multiple image-classification benchmarks compared to a standard AdamW baseline.

Stamatis Mastromichalakis | 2026-03-10 | 🤖 cs.LG

Correlation Analysis of Generative Models

This paper proposes a unified linear representation for diffusion models and flow matching to theoretically demonstrate that the often weak correlation between noisy data and predicted targets in existing methods may adversely impact the learning process.

Zhengguo Li, Chaobing Zheng, Wei Wang | 2026-03-10 | 🤖 cs.LG

Annealed Co-Generation: Disentangling Variables via Progressive Pairwise Modeling

This paper proposes Annealed Co-Generation (ACG), a framework that replaces high-dimensional joint diffusion modeling with a low-dimensional, pairwise approach coupled through a three-stage annealing process to achieve efficient and consistent multivariate co-generation for scientific applications like flow-field completion and antibody generation.

Hantao Zhang, Jieke Wu, Mingda Xu, Xiao Hu, Yingxuan You, Pascal Fua | 2026-03-10 | 🤖 cs.LG

RACER: Risk-Aware Calibrated Efficient Routing for Large Language Models

RACER is a novel, model-agnostic routing framework that addresses misrouting risks in multi-LLM systems by formulating the problem as $\alpha$-VOR to generate calibrated, variable-sized model sets with rigorous distribution-free risk control, thereby enhancing downstream accuracy and cost-performance trade-offs.

Sai Hao, Hao Zeng, Hongxin Wei, Bingyi Jing | 2026-03-10 | 🤖 cs.LG

Evo: Autoregressive-Diffusion Large Language Models with Evolving Balance

The paper introduces Evo, a novel large language model that unifies autoregressive and diffusion-based generation within a continuous evolutionary latent framework, enabling adaptive balancing of planning and refinement to achieve state-of-the-art performance across diverse benchmarks while maintaining fast inference speeds.

Junde Wu, Minhao Hu, Jiayuan Zhu, Yuyuan Liu, Tianyi Zhang, Kang Li, Jingkun Chen, Jiazhen Pan, Min Xu, Yueming Jin | 2026-03-10 | 🤖 cs.LG

Distilling and Adapting: A Topology-Aware Framework for Zero-Shot Interaction Prediction in Multiplex Biological Networks

This paper proposes a novel topology-aware framework that leverages domain-specific foundation models, a graph tokenizer for multiplex connectivity, and knowledge distillation to achieve robust zero-shot interaction prediction in multiplex biological networks, outperforming state-of-the-art methods.

Alana Deng, Sugitha Janarthanan, Yan Sun, Zihao Jing, Pingzhao Hu | 2026-03-10 | 🤖 cs.LG

Not all tokens are needed (NAT): token efficient reinforcement learning

The paper introduces NAT (Not All Tokens Are Needed), a token-efficient reinforcement learning framework that utilizes unbiased partial-token gradient estimation via Horvitz-Thompson reweighting to achieve full-sequence performance with significantly reduced compute and memory costs by updating policies on only a subset of generated tokens.

Hejian Sang, Yuanda Xu, Zhengze Zhou, Ran He, Zhipeng Wang | 2026-03-10 | 🤖 cs.LG

GraphSkill: Documentation-Guided Hierarchical Retrieval-Augmented Coding for Complex Graph Reasoning

GraphSkill is an agentic framework that improves complex graph reasoning by leveraging hierarchical document retrieval and self-debugging with generated test cases, validated on a new comprehensive dataset.

Fali Wang, Chenglin Weng, Xianren Zhang, Siyuan Hong, Hui Liu, Suhang Wang | 2026-03-10 | 🤖 cs.LG

Reward Under Attack: Analyzing the Robustness and Hackability of Process Reward Models

This paper reveals that state-of-the-art Process Reward Models (PRMs) are systematically exploitable by adversarial optimization, functioning primarily as fluency detectors rather than reasoning verifiers due to a critical dissociation between stylistic changes and ground-truth accuracy, prompting the release of a diagnostic framework and benchmark to address these vulnerabilities.

Rishabh Tiwari, Aditya Tomar, Udbhav Bamba, Monishwaran Maheswaran, Heng Yang, Michael W. Mahoney, Kurt Keutzer, Amir Gholami | 2026-03-10 | 🤖 cs.LG

From ARIMA to Attention: Power Load Forecasting Using Temporal Deep Learning

This paper empirically demonstrates that a Transformer model using self-attention outperforms traditional ARIMA and recurrent neural network approaches (LSTM, BiLSTM) in short-term power load forecasting on PJM data, achieving a 3.8% MAPE and highlighting the effectiveness of attention-based architectures for capturing complex temporal patterns.

Suhasnadh Reddy Veluru, Sai Teja Erukude, Viswa Chaitanya Marella | 2026-03-10 | 🤖 cs.LG

Advances in GRPO for Generation Models: A Survey

This survey comprehensively reviews the methodological advances and diverse applications of Flow-GRPO, a framework that extends Group Relative Policy Optimization to align large-scale flow matching models with human preferences across various generative tasks and modalities.

Zexiang Liu, Xianglong He, Yangguang Li | 2026-03-10 | 🤖 cs.LG

Exploration Space Theory: Formal Foundations for Prerequisite-Aware Location-Based Recommendation

This paper introduces Exploration Space Theory (EST), a formal lattice-theoretic framework that adapts Knowledge Space Theory to location-based recommendation by modeling prerequisite dependencies among points of interest, thereby providing structural guarantees for validity, optimality, and explainability in the Exploration Space Recommender System (ESRS).

Madjid Sadallah | 2026-03-10 | 🤖 cs.LG

Pavement Missing Condition Data Imputation through Collective Learning-Based Graph Neural Networks

This paper proposes a collective learning-based Graph Convolutional Network model that effectively imputes missing pavement condition data by integrating features from adjacent road sections and capturing dependencies between observed conditions, demonstrating promising results in a Texas Department of Transportation case study.

Ke Yu, Lu Gao | 2026-03-10 | 🤖 cs.LG

Grouter: Decoupling Routing from Representation for Accelerated MoE Training

Grouter is a preemptive routing framework that decouples structural optimization from weight updates by distilling high-quality routing policies from fully trained models, thereby significantly accelerating Mixture-of-Experts (MoE) training convergence and throughput while improving data utilization.

Yuqi Xu, Rizhen Hu, Zihan Liu, Mou Sun, Kun Yuan | 2026-03-10 | 🤖 cs.LG

T-REX: Transformer-Based Category Sequence Generation for Grocery Basket Recommendation

The paper proposes T-REX, a novel transformer-based architecture that addresses the unique challenges of online grocery shopping by generating personalized category-level basket recommendations through dynamic sequence splitting, adaptive positional encoding, and causal masking to effectively capture both short-term dependencies and long-term user preferences.

Soroush Mokhtari, Muhammad Tayyab Asif, Sergiy Zubatiy | 2026-03-10 | 🤖 cs.LG

Leakage Safe Graph Features for Interpretable Fraud Detection in Temporal Transaction Networks

This paper proposes a leakage-safe, time-respecting graph feature extraction protocol for temporal transaction networks that, when combined with transaction attributes, significantly enhances the interpretability and performance of illicit entity classification while preventing look-ahead bias.

Hamideh Khaleghpour, Brett McKinney | 2026-03-10 | 🤖 cs.LG

A new Uncertainty Principle in Machine Learning

This paper proposes a new "Uncertainty Principle" in machine learning: in polynomial-based problems, the sharpness of a minimum is inversely related to the smoothness of the optimization landscape. The authors attribute this to the degeneracy of Heaviside and sigmoid expansions, which traps gradient descent, and argue that these scientific problems call for a physics-based rather than purely computational approach.

V. Dolotin, A. Morozov | 2026-03-10 | 🤖 cs.LG

Graph Property Inference in Small Language Models: Effects of Representation and Inference Strategy

This paper demonstrates that the ability of small language models to infer graph properties depends critically on how relational information is represented and the reasoning strategy employed, rather than solely on model scale.

Michal Podstawski | 2026-03-10 | 🤖 cs.LG

SmartBench: Evaluating LLMs in Smart Homes with Anomalous Device States and Behavioral Contexts

This paper introduces SmartBench, the first dataset designed to evaluate LLMs on detecting anomalous device states and behavioral contexts in smart homes, revealing that current state-of-the-art models struggle significantly with this critical task.

Qingsong Zou, Zhi Yan, Zhiyao Xu, Kuofeng Gao, Jingyu Xiao, Yong Jiang | 2026-03-10 | 🤖 cs.LG
