cs.LG papers | Gist.Science

Astromer 2

This paper introduces Astromer 2, an enhanced foundational model for light curve analysis that leverages self-supervised pre-training on 1.5 million MACHO survey observations and weighted per-sample embeddings to significantly outperform its predecessor and prior models in classification tasks, particularly when trained with limited labeled data.

Cristobal Donoso-Oliva, Ignacio Becker, Pavlos Protopapas + 3 more | 2026-03-11 | 🔭 astro-ph

When Machine Learning Gets Personal: Evaluating Prediction and Explanation

This paper proposes a unified framework to evaluate how personalization affects both prediction accuracy and explanation clarity in high-stakes domains, deriving statistical bounds to determine when such effects are detectable given specific dataset characteristics.

Louisa Cornelis, Guillermo Bernárdez, Haewon Jeong, Nina Miolane | 2026-03-11 | 🤖 cs.LG

On the Impact of the Utility in Semivalue-based Data Valuation

This paper addresses the sensitivity of semivalue-based data valuation to utility selection by introducing a "spatial signature" framework that embeds data points into a lower-dimensional space, enabling a practical methodology to quantify and ensure the robustness of valuation results against utility variations.

Mélissa Tamine, Benjamin Heymann, Maxime Vono, Patrick Loiseau | 2026-03-11 | 🤖 cs.AI

A Distributional Treatment of Real2Sim2Real for Object-Centric Agent Adaptation in Vision-Driven Deformable Linear Object Manipulation

This paper presents an end-to-end Real2Sim2Real framework for deformable linear object manipulation that employs likelihood-free inference to estimate physical parameter distributions for domain-randomized reinforcement learning, enabling zero-shot deployment of visuomotor policies from simulation to the real world.

Georgios Kamaras, Subramanian Ramamoorthy | 2026-03-11 | 🤖 cs.LG

Improving clustering quality evaluation in noisy Gaussian mixtures

This paper introduces Feature Importance Rescaling (FIR), a theoretically grounded method that improves the reliability of cluster validity indices in noisy, high-dimensional Gaussian mixtures by attenuating irrelevant features, thereby strengthening the correlation between unsupervised evaluation metrics and ground truth.

Renato Cordeiro de Amorim, Vladimir Makarenkov | 2026-03-11 | 🤖 cs.LG

Functional Unit: A New Perspective on Materials Science Research Paradigms

This perspective proposes the concept of "functional units" as a critical bridge to reconcile traditional structure-property correlations with emerging data-driven AI paradigms, thereby advancing the understanding of material design and knowledge inheritance across diverse systems.

Caichao Ye, Tao Feng, Weishu Liu + 1 more | 2026-03-11 | 🔬 cond-mat.mtrl-sci

HyConEx: Hypernetwork classifier with counterfactual explanations for tabular data

The paper introduces HyConEx, a novel deep hypernetwork-based classifier for tabular data that uniquely integrates prediction and explanation by simultaneously generating class labels and local counterfactual examples to interpret model decisions.

Patryk Marszałek, Kamil Książek, Oleksii Furman, Ulvi Movsum-zada, Przemysław Spurek, Marek Śmieja | 2026-03-11 | 🤖 cs.AI

Experiments with Optimal Model Trees

This paper presents mixed-integer linear programming formulations to construct globally optimal model trees with linear support vector machines at the leaves, demonstrating through extensive experiments that they achieve competitive accuracy with significantly smaller, more interpretable structures compared to greedy model trees and other standard machine learning algorithms.

Sabino Francesco Roselli, Eibe Frank | 2026-03-11 | 🤖 cs.LG

A Consequentialist Critique of Binary Classification Evaluation: Theory, Practice, and Tools

This paper critiques the prevalent reliance on fixed-threshold metrics in machine learning evaluation by advocating for a consequentialist framework that prioritizes proper scoring rules like the Brier score, supported by a new decision-theoretic mapping, a practical Python package called `briertools`, and a clipped Brier score variant to bridge the gap between theoretical utility and current practices.

Gerardo Flores, Abigail Schiff, Alyssa H. Smith, Julia A Fukuyama, Ashia C. Wilson | 2026-03-11 | 🤖 cs.AI
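
To illustrate the contrast drawn in the summary above, here is a minimal sketch comparing accuracy at a fixed threshold with the standard Brier score (the mean squared error between predicted probabilities and 0/1 outcomes). The function names and toy data are hypothetical; this does not use or reproduce the `briertools` API or the paper's clipped Brier variant.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    if len(y_true) != len(p_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def accuracy_at_threshold(y_true, p_pred, threshold=0.5):
    """Fraction of examples classified correctly after thresholding."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


# Toy labels and two hypothetical models that rank examples identically.
y = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.9, 0.1]
hesitant = [0.6, 0.4, 0.6, 0.4]

for name, probs in [("confident", confident), ("hesitant", hesitant)]:
    print(name, accuracy_at_threshold(y, probs), round(brier_score(y, probs), 4))
```

Both toy models reach the same accuracy at the 0.5 threshold, while the Brier score separates them because it is sensitive to how far each probability sits from the observed outcome.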

Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification

This paper introduces CDGLT, a training-efficient framework for multimodal metaphor identification that leverages Concept Drift via Spherical Linear Interpolation and adapted LayerNorm tuning to achieve state-of-the-art performance on the MET-Meme benchmark while significantly reducing computational costs compared to existing generative methods.

Wenhao Qian, Zhenzhen Hu, Zijie Song, Jia Li | 2026-03-11 | 🤖 cs.LG

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO

This paper introduces Stepwise Guided Policy Optimization (SGPO), a framework that enhances Group Relative Policy Optimization (GRPO) by utilizing a step-wise judge model to provide learning signals from all-negative sample groups, thereby enabling large language models to learn from incorrect reasoning and improving performance across various reasoning benchmarks.

Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin | 2026-03-11 | 🤖 cs.AI
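
The "all-negative sample group" problem mentioned in the summary above can be seen in a small sketch of group-relative advantage normalization, as GRPO is commonly described: each sampled response's reward is centered on the group mean and scaled by the group standard deviation. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration; the step-wise judge that SGPO adds is not shown.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Center each reward on the group mean and scale by the group std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group with both correct (1.0) and incorrect (0.0) responses.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A group where every sampled response is incorrect.
all_wrong = group_relative_advantages([0.0, 0.0, 0.0, 0.0])

print(mixed)
print(all_wrong)
```

When every response in a group receives the same reward, all advantages are zero, so the group contributes no learning signal under this formulation. That is the gap a judge-provided signal would need to fill.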

The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM

This paper introduces the Gaussian-Multinoulli Restricted Boltzmann Machine (GM-RBM), a generative model that extends the standard GB-RBM by employing q-state Potts hidden units to better capture discrete, structured representations, demonstrating competitive performance on analogical recall and memory benchmarks while offering a scalable alternative to binary latent models.

Nikhil Kapasi, Mohamed Elfouly, William Whitehead, Luke Theogarajan | 2026-03-11 | 🤖 cs.LG

JULI: Jailbreak Large Language Models by Self-Introspection

The paper introduces JULI, a black-box jailbreaking technique that manipulates top-5 token log probabilities via a lightweight plug-in called BiasNet to effectively bypass safety alignment in API-accessible Large Language Models without requiring access to model weights or the generation process.

Jesson Wang, Zhanhao Hu, David Wagner | 2026-03-11 | 🤖 cs.LG

Discovering Symbolic Differential Equations with Symmetry Invariants

This paper introduces a novel framework for discovering symbolic differential equations from data by utilizing symmetry invariants as atomic building blocks, thereby ensuring that the resulting equations inherently respect physical laws while improving the accuracy and efficiency of existing discovery methods.

Jianke Yang, Manu Bhat, Bryan Hu, Yadi Cao, Nima Dehmamy, Robin Walters, Rose Yu | 2026-03-11 | 🤖 cs.LG

UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models

The paper introduces UltraEdit, a training-, subject-, and memory-free approach for lifelong language model editing that achieves unprecedented scalability and efficiency by computing parameter shifts in a single step, enabling 7B models to be edited on consumer GPUs with over 2 million updates while outperforming existing methods in speed, memory usage, and accuracy.

Xiaojie Gu, Ziying Huang, Jia-Chen Gu, Kai Zhang | 2026-03-11 | 🤖 cs.AI

A Systematic Evaluation of On-Device LLMs: Quantization, Performance, and Resources

This paper presents a systematic evaluation of on-device Large Language Models across various sizes and quantization methods, revealing that heavily quantized larger models outperform smaller high-precision ones beyond a 3.5 bits-per-weight threshold while identifying a shift from communication to computational constraints as model size decreases.

Qingyu Song, Rui Liu, Wei Lin, Peiyu Liao, Wenqian Zhao, Yiwen Wang, Shoubo Hu, Yining Jiang, Mochun Long, Hui-Ling Zhen, Ning Jiang, Mingxuan Yuan, Qiao Xiang, Hong Xu | 2026-03-11 | 🤖 cs.LG

SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning

The paper introduces Saturn, a reinforcement learning framework that leverages Boolean Satisfiability (SAT) problems to overcome scalability, verifiability, and difficulty control limitations in training large language models, resulting in significant reasoning improvements across SAT, math, and programming benchmarks.

Huanyu Liu, Ge Li, Jia Li, Hao Zhu, Kechi Zhang, Yihong Dong | 2026-03-11 | 🤖 cs.AI

FrontierCO: Real-World and Large-Scale Evaluation of Machine Learning Solvers for Combinatorial Optimization

The paper introduces FrontierCO, a large-scale benchmark utilizing real-world and competition-grade datasets across eight combinatorial optimization problems to rigorously evaluate ML solvers against classical methods, revealing a persistent performance gap on extreme-scale instances while identifying specific scenarios where ML approaches excel.

Shengyu Feng, Weiwei Sun, Shanda Li, Ameet Talwalkar, Yiming Yang | 2026-03-11 | 🤖 cs.LG

Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

This paper presents the first systematic review of integrating foundation models into mobile service robotics, analyzing how these technologies address core challenges in perception and control, enabling applications in domestic and healthcare settings while discussing ethical implications and outlining future directions for safe, scalable, and trustworthy deployment.

Matthew Lisondra, Beno Benhabib, Goldie Nejat | 2026-03-11 | 💬 cs.CL

Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score

This paper proposes SemiCP, a semi-supervised conformal prediction framework that utilizes an unlabeled nonconformity score based on Nearest Neighbor Matching to leverage unlabeled data for calibration, thereby significantly reducing coverage gaps and improving stability in scenarios with limited labeled data.

Xuanning Zhou, Zihao Shi, Hao Zeng, Xiaobo Xia, Bingyi Jing, Hongxin Wei | 2026-03-11 | 🤖 cs.LG
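
As background for the summary above, here is a minimal sketch of ordinary split conformal calibration for classification, the labeled-data baseline that a semi-supervised approach would build on. It is not SemiCP: the unlabeled nonconformity score and Nearest Neighbor Matching are not shown. The function names, the nonconformity score (one minus the predicted probability), and the toy numbers are assumptions chosen for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Finite-sample quantile of calibration scores for target coverage 1 - alpha."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label must be kept.
        return math.inf
    return sorted(cal_scores)[k - 1]


def prediction_set(class_probs, qhat):
    """Keep every label whose score (1 - probability) is within the threshold."""
    return [label for label, p in class_probs.items() if 1 - p <= qhat]


# Hypothetical nonconformity scores from 7 labeled calibration examples.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.70]
qhat = conformal_threshold(cal_scores, alpha=0.25)

print(qhat)
print(prediction_set({"a": 0.6, "b": 0.3, "c": 0.1}, qhat))
```

The threshold is an order statistic of the labeled calibration scores, so with few labeled examples it is coarse and can even become infinite. This sensitivity to the size of the labeled calibration set is the setting in which using unlabeled data becomes attractive.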
