cs.LG papers | Gist.Science

Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification

This paper introduces CDGLT, a training-efficient framework for multimodal metaphor identification that leverages Concept Drift via Spherical Linear Interpolation and adapted LayerNorm tuning to achieve state-of-the-art performance on the MET-Meme benchmark while significantly reducing computational costs compared to existing generative methods.

Wenhao Qian, Zhenzhen Hu, Zijie Song, Jia Li2026-03-11🤖 cs.LG

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO

This paper introduces Stepwise Guided Policy Optimization (SGPO), a framework that enhances Group Relative Policy Optimization (GRPO) by utilizing a step-wise judge model to provide learning signals from all-negative sample groups, thereby enabling large language models to learn from incorrect reasoning and improving performance across various reasoning benchmarks.

Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin2026-03-11🤖 cs.AI

The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM

This paper introduces the Gaussian-Multinoulli Restricted Boltzmann Machine (GM-RBM), a generative model that extends the standard GB-RBM by employing q-state Potts hidden units to better capture discrete, structured representations, demonstrating competitive performance on analogical recall and memory benchmarks while offering a scalable alternative to binary latent models.

Nikhil Kapasi, Mohamed Elfouly, William Whitehead, Luke Theogarajan2026-03-11🤖 cs.LG

JULI: Jailbreak Large Language Models by Self-Introspection

The paper introduces JULI, a black-box jailbreaking technique that manipulates top-5 token log probabilities via a lightweight plug-in called BiasNet to effectively bypass safety alignment in API-accessible Large Language Models without requiring access to model weights or the generation process.

Jesson Wang, Zhanhao Hu, David Wagner2026-03-11🤖 cs.LG

Discovering Symbolic Differential Equations with Symmetry Invariants

This paper introduces a novel framework for discovering symbolic differential equations from data by utilizing symmetry invariants as atomic building blocks, thereby ensuring that the resulting equations inherently respect physical laws while improving the accuracy and efficiency of existing discovery methods.

Jianke Yang, Manu Bhat, Bryan Hu, Yadi Cao, Nima Dehmamy, Robin Walters, Rose Yu2026-03-11🤖 cs.LG

UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models

The paper introduces UltraEdit, a training-, subject-, and memory-free approach for lifelong language model editing that achieves unprecedented scalability and efficiency by computing parameter shifts in a single step, enabling 7B models to be edited on consumer GPUs with over 2 million updates while outperforming existing methods in speed, memory usage, and accuracy.

Xiaojie Gu, Ziying Huang, Jia-Chen Gu, Kai Zhang2026-03-11🤖 cs.AI

A Systematic Evaluation of On-Device LLMs: Quantization, Performance, and Resources

This paper presents a systematic evaluation of on-device Large Language Models across various sizes and quantization methods, revealing that heavily quantized larger models outperform smaller high-precision ones beyond a 3.5 bits-per-weight threshold while identifying a shift from communication to computational constraints as model size decreases.

Qingyu Song, Rui Liu, Wei Lin, Peiyu Liao, Wenqian Zhao, Yiwen Wang, Shoubo Hu, Yining Jiang, Mochun Long, Hui-Ling Zhen, Ning Jiang, Mingxuan Yuan, Qiao Xiang, Hong Xu2026-03-11🤖 cs.LG

SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning

The paper introduces Saturn, a reinforcement learning framework that leverages Boolean Satisfiability (SAT) problems to overcome scalability, verifiability, and difficulty control limitations in training large language models, resulting in significant reasoning improvements across SAT, math, and programming benchmarks.

Huanyu Liu, Ge Li, Jia Li, Hao Zhu, Kechi Zhang, Yihong Dong2026-03-11🤖 cs.AI

FrontierCO: Real-World and Large-Scale Evaluation of Machine Learning Solvers for Combinatorial Optimization

The paper introduces FrontierCO, a large-scale benchmark utilizing real-world and competition-grade datasets across eight combinatorial optimization problems to rigorously evaluate ML solvers against classical methods, revealing a persistent performance gap on extreme-scale instances while identifying specific scenarios where ML approaches excel.

Shengyu Feng, Weiwei Sun, Shanda Li, Ameet Talwalkar, Yiming Yang2026-03-11🤖 cs.LG

Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

This paper presents the first systematic review of integrating foundation models into mobile service robotics, analyzing how these technologies address core challenges in perception and control, enabling applications in domestic and healthcare settings while discussing ethical implications and outlining future directions for safe, scalable, and trustworthy deployment.

Matthew Lisondra, Beno Benhabib, Goldie Nejat2026-03-11💬 cs.CL

Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score

This paper proposes SemiCP, a semi-supervised conformal prediction framework that utilizes an unlabeled nonconformity score based on Nearest Neighbor Matching to leverage unlabeled data for calibration, thereby significantly reducing coverage gaps and improving stability in scenarios with limited labeled data.

Xuanning Zhou, Zihao Shi, Hao Zeng, Xiaobo Xia, Bingyi Jing, Hongxin Wei2026-03-11🤖 cs.LG

Pure Exploration with Infinite Answers

This paper addresses pure exploration problems with potentially infinite answer sets by deriving an instance-dependent lower bound, demonstrating the limitations of existing methods like Sticky Track-and-Stop, and proposing a novel, asymptotically optimal framework called Sticky-Sequence Track-and-Stop.

Riccardo Poiani, Martino Bernasconi, Andrea Celli2026-03-11🤖 cs.LG

Rating Quality of Diverse Time Series Data by Meta-learning from LLM Judgment

The paper proposes TSRating, a novel meta-learning framework that leverages LLMs to generate quality comparisons across diverse domains and trains an efficient TSRater model to accurately and adaptively evaluate time series data quality without requiring extensive hypergradient computations.

Shunyu Wu, Dan Li, Wenjie Feng, Haozheng Ye, Jian Lou, See-Kiong Ng2026-03-11🤖 cs.AI

Cooperative Game-Theoretic Credit Assignment for Multi-Agent Policy Gradients via the Core

This paper proposes CORA, a cooperative game-theoretic credit assignment method that utilizes core allocation and coalition sampling to effectively distribute global advantages among agents in multi-agent reinforcement learning, thereby overcoming the limitations of uniform sharing and enhancing coordinated optimal behavior.

Mengda Ji, Genjiu Xu, Keke Jia, Zekun Duan, Yong Qiu, Jianjun Ge, Mingqiang Li2026-03-11🤖 cs.AI

Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning

This paper introduces two novel model-free algorithms, Q-EarlySettled-LowCost and FedQ-EarlySettled-LowCost, for single-agent and federated reinforcement learning that simultaneously achieve near-optimal regret, linear burn-in costs in state and action spaces, and logarithmic policy switching or communication costs, while also providing improved gap-dependent theoretical guarantees.

Haochen Zhang, Zhong Zheng, Lingzhou Xue2026-03-11🤖 cs.LG

Towards Robust Real-World Multivariate Time Series Forecasting: A Unified Framework for Dependency, Asynchrony, and Missingness

The paper introduces ChannelTokenFormer, a unified Transformer-based framework that simultaneously addresses the challenges of complex inter-channel dependencies, asynchronous sampling, and missing values to achieve robust real-world multivariate time series forecasting.

Jinkwan Jang, Hyungjin Park, Jinmyeong Choi, Taesup Kim2026-03-11🤖 cs.AI

Uncovering Social Network Activity Using Joint User and Topic Interaction

This paper introduces the Mixture of Interacting Cascades (MIC), a joint user-topic interaction model based on marked multidimensional Hawkes processes that outperforms existing methods in modeling information spread and provides insightful visualizations of social network activity.

Gaspard Abel, Argyris Kalogeratos, Jean-Pierre Nadal, Julien Randon-Furling2026-03-11🤖 cs.LG

ConLID: Supervised Contrastive Learning for Low-Resource Language Identification

The paper proposes ConLID, a supervised contrastive learning approach that learns domain-invariant representations to significantly improve language identification performance for low-resource languages on out-of-domain data while maintaining accuracy for high-resource languages.

Negar Foroutan, Jakhongir Saydaliev, Ye Eun Kim, Antoine Bosselut2026-03-11🤖 cs.AI

Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery

This paper establishes the first global linear convergence guarantees for a dynamic smoothing variant of Iteratively Reweighted Least Squares (IRLS) in robust subspace and affine subspace recovery, extending these theoretical results to nonconvex optimization on Riemannian manifolds and demonstrating their practical utility in low-dimensional neural network training.

Gilad Lerman, Kang Li, Tyler Maunu, Teng Zhang2026-03-11🤖 cs.LG

Convergence Rate for the Last Iterate of Stochastic Gradient Descent Schemes

This paper establishes new convergence rates for the last iterate of stochastic gradient descent and stochastic heavy ball methods in parametric settings with globally convex or non-convex objectives having $\gamma$ -Hölder gradients, utilizing discrete Gronwall's inequality to derive improved bounds without relying on the Robbins-Siegmund theorem.

Marcel Hudiani2026-03-11🤖 cs.LG

← Previous Next →