cs.LG papers | Gist.Science

ORIC: Benchmarking Object Recognition under Contextual Incongruity in Large Vision-Language Models

This paper introduces the ORIC framework and benchmark to evaluate and improve Large Vision-Language Models' object recognition capabilities under contextual incongruity, demonstrating that such scenarios significantly degrade performance and that targeted Visual Reinforcement Fine-Tuning can effectively mitigate these failures.

Zhaoyang Li, Zhan Ling, Yuchen Zhou, Litian Gong, Erdem Bıyık, Hao Su | Tue, 10 Ma | cs.LG

ORN-CBF: Learning Observation-conditioned Residual Neural Control Barrier Functions via Hypernetworks

This paper proposes ORN-CBF, a hypernetwork-based learning framework that uses Hamilton-Jacobi reachability analysis to generate observation-conditioned neural control barrier functions, aiming for rigorous safety guarantees and improved generalization in partially observable environments, and evaluates the approach in simulation and hardware experiments.

Bojan Derajic, Sebastian Bernhard, Wolfgang Hönig | Tue, 10 Ma | cs.LG

Empirical PAC-Bayes bounds for Markov chains

This paper introduces the first fully empirical PAC-Bayes bound for Markov chains by deriving a data-dependent estimate for the pseudo-spectral gap, thereby eliminating the need for unknown constants related to mixing properties that typically hinder practical generalization guarantees.

Vahe Karagulyan, Pierre Alquier | Tue, 10 Ma | cs.LG

Linear probes rely on textual evidence: Results from leakage mitigation studies in language models

This paper demonstrates that linear probes used to detect harmful behaviors in language models are heavily reliant on explicit textual evidence, as their performance significantly degrades when such surface-level cues are filtered out or when models are trained to express behaviors without verbalization.

Gerard Boxo, Aman Neelappa, Shivam Raval | Tue, 10 Ma | cs.LG

AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs

The paper introduces AEGIS, an edge-only augmentation framework that resamples existing training edges to enhance link prediction in edge-sparse bipartite knowledge graphs, demonstrating that authenticity-constrained resampling preserves data integrity while semantic KNN augmentation further boosts performance when node descriptions are available.

Hugh Xuechen Liu, Kıvanç Tatar | Tue, 10 Ma | cs.LG

GDR-learners: Orthogonal Learning of Generative Models for Potential Outcomes

This paper introduces GDR-learners, a flexible suite of generative models (including CNFs, CGANs, CVAEs, and CDMs) that achieve quasi-oracle efficiency and double robustness for estimating potential outcome distributions, thereby outperforming existing methods in both theoretical properties and empirical performance.

Valentyn Melnychuk, Stefan Feuerriegel | Tue, 10 Ma | cs.LG

CLAD-Net: Continual Activity Recognition in Multi-Sensor Wearable Systems

CLAD-Net is a continual learning framework for wearable human activity recognition that combines a self-supervised transformer for long-term memory and a supervised CNN with knowledge distillation to effectively mitigate catastrophic forgetting and handle label scarcity across diverse subjects.

Reza Rahimi Azghan, Gautham Krishna Gudur, Mohit Malu, Edison Thomaz, Giulia Pedrielli, Pavan Turaga, Hassan Ghasemzadeh | Tue, 10 Ma | cs.LG

Generative Evolutionary Meta-Solver (GEMS): Scalable Surrogate-Free Multi-Agent Reinforcement Learning

The paper introduces Generative Evolutionary Meta-Solver (GEMS), a scalable, surrogate-free multi-agent reinforcement learning framework that replaces explicit policy populations with a compact generator and latent anchors to achieve significantly faster training, lower memory usage, and higher rewards than traditional methods like PSRO while maintaining game-theoretic guarantees.

Alakh Sharma, Gaurish Trivedi, Kartikey Singh Bhandari, Yash Sinha, Dhruv Kumar, Pratik Narang, Jagat Sesh Challa | Tue, 10 Ma | cs.LG

FS-KAN: Permutation Equivariant Kolmogorov-Arnold Networks via Function Sharing

This paper introduces FS-KAN, a principled framework that constructs permutation equivariant and invariant Kolmogorov-Arnold Networks via function sharing, offering superior data efficiency and interpretability while maintaining the expressive power of standard parameter-sharing networks.

Ran Elbaz, Guy Bar-Shalom, Yam Eitan, Fabrizio Frasca, Haggai Maron | Tue, 10 Ma | cs.LG

Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation

This paper introduces Overlap-Adaptive Regularization (OAR), a novel method that enhances the performance of existing CATE meta-learners in low-overlap regions by proportionally increasing regularization based on overlap weights, while offering flexible, debiased variants that preserve Neyman-orthogonality for robust inference.

Valentyn Melnychuk, Dennis Frauen, Jonas Schweisthal, Stefan Feuerriegel | Tue, 10 Ma | cs.LG

Cold-Start Active Correlation Clustering

This paper addresses the cold-start scenario in active correlation clustering, where no initial pairwise similarities are available, by proposing a coverage-aware method that encourages diversity to efficiently query similarities and achieve effective clustering.

Linus Aronsson, Han Wu, Morteza Haghir Chehreghani | Tue, 10 Ma | cs.LG

Feedback Control for Small Budget Pacing

This paper proposes a principled feedback control method combining bucketized hysteresis and proportional feedback to achieve stable, accurate, and scalable budget pacing for online advertising campaigns, particularly improving performance for small-budget scenarios by significantly reducing pacing errors and volatility compared to existing baselines.

Sreeja Apparaju, Yichuan Niu, Xixi Qi | Tue, 10 Ma | cs.LG

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

This paper introduces and empirically validates the concept of "misevolution," demonstrating that self-evolving LLM agents face widespread, emergent risks across model, memory, tool, and workflow pathways that can lead to safety degradation and unintended vulnerabilities, thereby highlighting an urgent need for new safety paradigms.

Shuai Shao, Qihan Ren, Chen Qian, Boyi Wei, Dadi Guo, Jingyi Yang, Xinhao Song, Linfeng Zhang, Weinan Zhang, Dongrui Liu, Jing Shao | Tue, 10 Ma | cs.LG

An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes

This paper introduces the DRQ-learner, a novel meta-learner for estimating individualized potential outcomes in Markov Decision Processes that combines double robustness, Neyman orthogonality, and quasi-oracle efficiency to outperform existing state-of-the-art methods in sequential decision-making.

Emil Javurek, Valentyn Melnychuk, Jonas Schweisthal, Konstantin Hess, Dennis Frauen, Stefan Feuerriegel | Tue, 10 Ma | cs.LG

Privately Estimating Black-Box Statistics

This paper presents a differentially private scheme for estimating black-box statistics that effectively trades off between statistical efficiency and oracle efficiency, accompanied by lower bounds demonstrating the near-optimality of this approach.

Günter F. Steinke, Thomas Steinke | Tue, 10 Ma | cs.LG

Stochastic Self-Organization in Multi-Agent Systems

The paper introduces SelfOrg, a training-free framework that enables multi-agent systems to self-organize by dynamically constructing response-conditioned communication graphs based on Shapley value approximations, thereby optimizing collaboration and significantly improving performance—especially with weaker LLMs—without relying on fixed topologies or external supervision.

Nurbek Tastan, Samuel Horvath, Karthik Nandakumar | Tue, 10 Ma | cs.LG

CroSTAta: Cross-State Transition Attention Transformer for Robotic Manipulation

The paper introduces CroSTAta, a Cross-State Transition Attention Transformer that enhances robotic manipulation robustness by employing a novel State Transition Attention mechanism to model temporal structures like failure and recovery patterns, outperforming standard attention and sequential models in simulation.

Giovanni Minelli, Giulio Turrisi, Victor Barasuol, Claudio Semini | Tue, 10 Ma | cs.LG

Double projection for reconstructing dynamical systems: between stochastic and deterministic regimes

This paper introduces a "double projection" method within dynamical variational autoencoders that simultaneously estimates system state trajectories and noise time series from data, enabling effective multi-step reconstruction and learning of low-dimensional stochastic models across various benchmark problems.

Viktor Sip, Martin Breyton, Spase Petkoski, Viktor Jirsa | Tue, 10 Ma | cs.LG

Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks

The paper introduces DialTree, a tree-based dialogue reinforcement learning framework that autonomously discovers diverse and effective multi-turn attack strategies against large language models, significantly outperforming existing single-turn or template-based red-teaming methods.

Ruohao Guo, Afshin Oroojlooy, Roshan Sridhar, Miguel Ballesteros, Alan Ritter, Dan Roth | Tue, 10 Ma | cs.LG

The Role of Feature Interactions in Graph-based Tabular Deep Learning

This paper demonstrates that current graph-based tabular deep learning methods fail to accurately recover underlying feature interaction structures despite their focus on predictive accuracy, and shows that explicitly modeling the true graph structure significantly improves prediction performance.

Elias Dubbeldam, Reza Mohammadi, Marit Schoonhoven, S. Ilker Birbil | Tue, 10 Ma | cs.LG
