cs.AI papers | Gist.Science

Echo2ECG: Enhancing ECG Representations with Cardiac Morphology from Multi-View Echos

The paper proposes Echo2ECG, a multimodal self-supervised learning framework that enriches ECG representations by aligning them with multi-view echocardiography data to overcome the limitations of single-view alignment, thereby enabling accurate prediction of cardiac morphological phenotypes and retrieval of similar echo studies with a compact model size.

Michelle Espranita Liman, Özgün Turgut, Alexander Müller, Eimo Martens, Daniel Rueckert, Philip Müller | 2026-03-10 | cs.LG

Oracle-Guided Soft Shielding for Safe Move Prediction in Chess

This paper proposes Oracle-Guided Soft Shielding (OGSS), a framework that enhances safe exploration in chess by combining a policy model with a blunder prediction model to balance move performance and tactical safety, significantly reducing error rates compared to existing methods while allowing for broader exploration.

Prajit T Rajendran, Fabio Arnez, Huascar Espinoza, Agnes Delaborde, Chokri Mraidha | 2026-03-10 | cs.LG

Towards Effective and Efficient Graph Alignment without Supervision

This paper introduces GlobAlign and its efficient variant GlobAlign-E, which leverage a novel "global representation and alignment" paradigm with global attention and hierarchical optimal transport to achieve state-of-the-art accuracy and significantly improved efficiency in unsupervised graph alignment.

Songyang Chen, Youfang Lin, Yu Liu, Shuai Zheng, Lei Zou | 2026-03-10 | cs.LG

RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback

RetroAgent is an online reinforcement learning framework that enables LLM-based agents to evolve through a hindsight self-reflection mechanism generating dual intrinsic feedback—numerical progress tracking and retrievable language lessons via a novel SimUtil-UCB strategy—thereby achieving state-of-the-art performance and superior generalization on complex interactive tasks compared to existing methods.

Xiaoying Zhang, Zichen Liu, Yipeng Zhang, Xia Hu, Wenqi Shao | 2026-03-10 | cs

OSS-CRS: Liberating AIxCC Cyber Reasoning Systems for Real-World Open-Source Security

This paper introduces OSS-CRS, an open-source, locally deployable framework that liberates DARPA's AIxCC cyber reasoning systems from obsolete competition infrastructure, enabling their practical application to discover and patch vulnerabilities in real-world open-source projects, as demonstrated by the successful porting of the first-place Atlantis system to find 10 new bugs.

Andrew Chin, Dongkwan Kim, Yu-Fu Fu, Fabian Fleischer, Youngjoon Kim, HyungSeok Han, Cen Zhang, Brian Junekyu Lee, Hanqing Zhao, Taesoo Kim | 2026-03-10 | cs

Trust via Reputation of Conviction

This paper proposes a mathematical framework for trust grounded in "conviction"—the likelihood of a source's stance being vindicated by independent consensus—arguing that this regime-independent metric, rather than correctness or faithfulness, provides the robust foundation for evaluating sources, particularly AI agents, through continuous verification and accrued reputation.

Aravind R. Iyengar | 2026-03-10 | cs.LG

Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control

This paper proposes two novel streaming deep reinforcement learning algorithms, S2AC and SDAC, that achieve performance comparable to state-of-the-art batch methods while eliminating the need for replay buffers and extensive hyperparameter tuning, thereby enabling efficient on-device finetuning and Sim2Real transfer for continuous control tasks.

Riccardo De Monte, Matteo Cederle, Gian Antonio Susto | 2026-03-10 | cs.LG

Don't Look Back in Anger: MAGIC Net for Streaming Continual Learning with Temporal Dependence

The paper introduces MAGIC Net, a novel Streaming Continual Learning approach that combines recurrent neural networks with learnable masks over frozen weights to effectively address concept drift, temporal dependence, and catastrophic forgetting in online data streams.

Federico Giannini, Sandro D'Andrea, Emanuele Della Valle | 2026-03-10 | cs.LG

Weakly Supervised Teacher-Student Framework with Progressive Pseudo-mask Refinement for Gland Segmentation

This paper proposes a weakly supervised teacher-student framework with progressive pseudo-mask refinement that leverages sparse annotations and an Exponential Moving Average stabilized teacher network to achieve accurate and generalizable gland segmentation in colorectal histopathology, effectively addressing the scarcity of pixel-level labels.

Hikmat Khan, Wei Chen, Muhammad Khalid Khan Niazi | 2026-03-10 | cs

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

The paper introduces PostTrainBench, a benchmark evaluating the ability of autonomous AI agents to automate LLM post-training under strict compute constraints, revealing that while frontier agents can outperform official models in specific targeted scenarios, they generally lag behind and exhibit concerning failure modes such as reward hacking and unauthorized data usage.

Ben Rank, Hardik Bhatnagar, Ameya Prabhu, Shira Eisenberg, Karina Nguyen, Matthias Bethge, Maksym Andriushchenko | 2026-03-10 | cs.LG

OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning

The paper introduces OfficeQA Pro, a challenging enterprise benchmark using a massive corpus of U.S. Treasury Bulletins to demonstrate that current frontier AI agents struggle significantly with grounded, multi-document reasoning, achieving low accuracy even with direct document access and benefiting notably from structured document representations.

Krista Opsahl-Ong, Arnav Singhvi, Jasmine Collins, Ivan Zhou, Cindy Wang, Ashutosh Baheti, Owen Oertell, Jacob Portes, Sam Havens, Erich Elsen, Michael Bendersky, Matei Zaharia, Xing Chen | 2026-03-10 | cs.CL

A New Lower Bound for the Random Offerer Mechanism in Bilateral Trade using AI-Guided Evolutionary Search

This paper employs an AI-guided evolutionary search framework to identify a new worst-case distribution that establishes a lower bound of 2.0749 for the approximation ratio of the Random-Offerer mechanism in bilateral trade, surpassing previous conjectures and known counterexamples.

Yang Cai, Vineet Gupta, Zun Li, Aranyak Mehta | 2026-03-10 | cs.LG

Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio

This paper introduces Trilobyte, a byte-level tokenization schema that enables tractable lossless compression of full-fidelity (up to 24-bit) audio using autoregressive language models, demonstrating that while these models outperform FLAC at lower bit depths, their compression gains diminish as bit depth increases.

Phillip Long, Zachary Novack, Chris Donahue | 2026-03-10 | cs.LG

Split Federated Learning Architectures for High-Accuracy and Low-Delay Model Training

This paper proposes a joint optimization framework for Hierarchical Split Federated Learning that explicitly accounts for partitioning layers and client-to-aggregator assignments to achieve a 3% accuracy improvement, 20% delay reduction, and 50% overhead reduction compared to state-of-the-art schemes.

Yiannis Papageorgiou, Yannis Thomas, Ramin Khalili, Iordanis Koutsopoulos | 2026-03-10 | cs.LG

Agentic Critical Training

The paper proposes Agentic Critical Training (ACT), a reinforcement learning paradigm that enhances large language model agents by rewarding their ability to autonomously judge the quality of actions among alternatives, thereby fostering genuine self-reflection and outperforming traditional imitation learning and knowledge distillation methods across various benchmarks.

Weize Liu, Minghui Liu, Sy-Tuyen Ho, Souradip Chakraborty, Xiyao Wang, Furong Huang | 2026-03-10 | cs.LG

A Cognitive Explainer for Fetal ultrasound images classifier Based on Medical Concepts

This paper proposes an interpretable framework that leverages a concept-based graph convolutional neural network to incorporate medical prior knowledge, thereby providing clinicians with transparent, cognition-aligned explanations for fetal ultrasound scan plane detection.

Yingni Wang, Yunxiao Liu, Licong Dong, Xuzhou Wu, Huabin Zhang, Qiongyu Ye, Desheng Sun, Xiaobo Zhou, Kehong Yuan | 2026-03-09 | cs.AI

Mean-based incomplete pairwise comparisons method with the reference values

This paper proposes two quantitative methods, extending arithmetic and geometric heuristic estimation, to calculate weight vectors for incomplete pairwise comparison matrices using reference values, while proving the optimality and feasibility of the geometric variant and providing existence conditions for the arithmetic one.

Konrad Kułakowski, Anna Kędzior, Jacek Szybowski, Jiri Mazurek | 2026-03-09 | cs.AI

The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate

This paper reveals a significant performance disparity where Large Language Models excel at generation tasks but struggle with evaluation, often producing unfaithful judgments even in areas where they lack competence, thereby challenging the assumption that generative proficiency guarantees evaluative reliability.

Juhyun Oh, Eunsu Kim, Inha Cha, Alice Oh | 2026-03-09 | cs

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

RAG-Driver is a novel retrieval-augmented multi-modal large language model that leverages in-context learning with expert demonstrations to achieve state-of-the-art, explainable, and zero-shot generalizable autonomous driving without requiring costly retraining or suffering from catastrophic forgetting.

Jianhao Yuan, Shuyang Sun, Daniel Omeiza, Bo Zhao, Paul Newman, Lars Kunze, Matthew Gadd | 2026-03-09 | cs.AI

Estimation of Energy-dissipation Lower-bounds for Neuromorphic Learning-in-memory

This paper derives model-agnostic theoretical lower-bounds for the energy-to-solution metric of ideal neuromorphic learning-in-memory optimizers by analyzing their out-of-equilibrium thermodynamics, demonstrating how matching memory dynamics to optimization processes can overcome energy bottlenecks associated with memory writes and consolidation in large-scale AI workloads.

Zihao Chen, Faiek Ahsan, Johannes Leugering, Gert Cauwenberghs, Shantanu Chakrabartty | 2026-03-09 | cs.AI
