cs.AI papers | Gist.Science

Empowering Locally Deployable Medical Agent via State Enhanced Logical Skills for FHIR-based Clinical Tasks

This paper introduces SELSM, a training-free framework that enhances locally deployable medical agents by distilling simulated clinical trajectories into entity-agnostic logical rules, significantly improving zero-shot FHIR-based task performance and achieving a 100% completion rate on MedAgentBench without compromising data privacy.

Wanrong Yang, Zhengliang Liu, Yuan Li, Bingjie Yan, Lingfang Li, Mingguang He, Dominik Wojtczak, Yalin Zheng, Danli Shi | 2026-03-10 | 💻 cs

MindfulAgents: Personalizing Mindfulness Meditation via an Expert-Aligned Multi-Agent System

MindfulAgents is a large language model-driven multi-agent system that personalizes mindfulness meditation through expert-aligned script generation and real-time adaptation, significantly improving user engagement, self-awareness, and stress reduction in both short-term and long-term studies.

Mengyuan (Millie) Wu, Zhihan Jiang, Yuang Fan, Richard Feng, Sahiti Dharmavaram, Mathew Polowitz, Shawn Fallon, Bashima Islam, Lizbeth Benson, Irene Tung, David Creswell, Xuhai Xu | 2026-03-10 | 💻 cs

How Private Are DNA Embeddings? Inverting Foundation Model Representations of Genomic Sequences

This study demonstrates that DNA foundation models (DNABERT-2, Evo 2, and NTv2) are vulnerable to model inversion attacks, where adversaries can reconstruct sensitive genomic sequences from shared embeddings with high accuracy, particularly for shorter sequences and per-token representations, thereby highlighting critical privacy risks in Embeddings-as-a-Service frameworks.

Sofiane Ouaari, Jules Kreuer, Nico Pfeifer | 2026-03-10 | 🤖 cs.LG

Post-Training with Policy Gradients: Optimality and the Base Model Barrier

This paper establishes that policy gradient methods can achieve near-optimal performance in post-training of autoregressive models while staying within the base model's support, but face an exponential query barrier when generalizing beyond it unless process rewards are used to leverage token-level likelihood quantiles.

Alireza Mousavi-Hosseini, Murat A. Erdogdu | 2026-03-10 | 🤖 cs.LG

Learning Quadruped Walking from Seconds of Demonstration

This paper presents a principled analysis of why imitation learning is effective for quadruped locomotion with limited data and proposes a new offline method that enables robots to learn robust walking policies from just a few seconds of demonstration.

Ruipeng Zhang, Hongzhan Yu, Ya-Chien Chang, Chenghao Li, Henrik I. Christensen, Sicun Gao | 2026-03-10 | 🤖 cs.LG

Elenchus: Generating Knowledge Bases from Prover-Skeptic Dialogues

This paper introduces Elenchus, a dialogue system that leverages prover-skeptic interactions between a human expert and an LLM to construct knowledge bases grounded in inferentialist semantics, mapping dialectical states to formal logic to explicitly capture and verify the inferential relationships and design rationales of complex ontologies like PROV-O.

Bradley P. Allen | 2026-03-10 | 💬 cs.CL

A Systematic Investigation of Document Chunking Strategies and Embedding Sensitivity

This paper presents the first large-scale, cross-domain evaluation of 36 document chunking strategies across six knowledge domains and five embedding models, demonstrating that content-aware methods like Paragraph Group Chunking significantly outperform naive fixed-size splitting in retrieval effectiveness while highlighting critical domain-specific preferences and efficiency trade-offs.

Muhammad Arslan Shaukat, Muntasir Adnan, Carlos C. N. Kuhn | 2026-03-10 | 💬 cs.CL

NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning

This paper introduces NePPO, a novel multi-agent reinforcement learning pipeline that computes approximate Nash equilibria in general-sum games by learning a player-independent potential function to transform the mixed cooperative-competitive environment into an approximating cooperative game.

Addison Kalanther, Sanika Bharvirkar, Shankar Sastry, Chinmay Maheshwari | 2026-03-10 | 🤖 cs.LG

Diffusion Controller: Framework, Algorithms and Parameterization

The paper introduces Diffusion Controller (DiffCon), a unified control-theoretic framework that models reverse diffusion sampling as a state-only stochastic control problem within LS-MDPs, enabling the derivation of practical fine-tuning algorithms and a lightweight side-network architecture that outperforms existing gray-box and white-box adaptation methods.

Tong Yang, Moonkyung Ryu, Chih-Wei Hsu, Guy Tennenholtz, Yuejie Chi, Craig Boutilier, Bo Dai | 2026-03-10 | 🤖 cs.LG

Masked Unfairness: Hiding Causality within Zero ATE

This paper demonstrates that optimizing for objectives like profit or crime reduction while maintaining a zero Average Treatment Effect (ATE) can mask significant unfairness driven by confounding, thereby arguing that fairness regulations must shift from evaluating aggregate decision-level outcomes to scrutinizing model-level causal mechanisms.

Zou Yang, Sophia Xiao, Bijan Mazaheri | 2026-03-10 | 🤖 cs.LG

Foundational World Models Accurately Detect Bimanual Manipulator Failures

This paper introduces a lightweight, probabilistic world model built on a pretrained vision foundation model that generates uncertainty-based runtime monitors to accurately detect anomalous failures in bimanual manipulators, outperforming existing baselines while requiring significantly fewer trainable parameters.

Isaac R. Ward, Michelle Ho, Houjun Liu, Aaron Feldman, Joseph Vincent, Liam Kruse, Sean Cheong, Duncan Eddy, Mykel J. Kochenderfer, Mac Schwager | 2026-03-10 | 💻 cs

SuperSkillsStack: Agency, Domain Knowledge, Imagination, and Taste in Human-AI Design Education

This study analyzes how 80 student design teams integrated generative AI into their creative process, revealing that while AI serves as a cognitive accelerator for early-stage tasks like brainstorming, human competencies in agency, domain knowledge, imagination, and taste remain essential for interpreting context, validating outputs, and refining design solutions.

Qian Huang, King Wang Poon | 2026-03-10 | 💻 cs

Can Safety Emerge from Weak Supervision? A Systematic Analysis of Small Language Models

This paper introduces Self-MOA, a fully automated framework that aligns small language models using weak supervision from automated evaluators to achieve significant safety improvements with minimal training data while preserving helpfulness.

Punyajoy Saha, Sudipta Halder, Debjyoti Mondal, Subhadarshi Panda | 2026-03-10 | 🤖 cs.LG

RESCHED: Rethinking Flexible Job Shop Scheduling from a Transformer-based Architecture with Simplified States

The paper introduces ReSched, a minimalist deep reinforcement learning framework that simplifies the Flexible Job Shop Scheduling Problem by condensing the state space to four essential features and utilizing a modified Transformer architecture, achieving superior performance and generalization across various scheduling variants compared to existing methods.

Xiangjie Xiao, Cong Zhang, Wen Song, Zhiguang Cao | 2026-03-10 | 🤖 cs.LG

Hit-RAG: Learning to Reason with Long Contexts via Preference Alignment

Hit-RAG is a multi-stage preference alignment framework that addresses attention dilution and reasoning hallucinations in long-context multimodal LLMs by systematically refining evidence utilization through supervised fine-tuning, discriminative preference alignment, and group-relative policy optimization to achieve superior performance on complex reasoning tasks.

Junming Liu, Yuqi Li, Shiping Wen, Zhigang Zeng, Tingwen Huang | 2026-03-10 | 💬 cs.CL

Enhancing Web Agents with a Hierarchical Memory Tree

This paper proposes the Hierarchical Memory Tree (HMT), a structured framework that decouples high-level task logic from site-specific action details through a three-level abstraction hierarchy, thereby significantly enhancing the generalization and robustness of large language model-based web agents in unseen environments.

Yunteng Tan, Zhi Gao, Xinxiao Wu | 2026-03-10 | 💻 cs

Self-Supervised Multi-Modal World Model with 4D Space-Time Embedding

The paper introduces DeepEarth, a self-supervised multi-modal world model featuring Earth4D, a novel 4D space-time positional encoder that achieves state-of-the-art ecological forecasting performance and outperforms larger foundation models through efficient planetary-scale learning.

Lance Legel, Qin Huang, Brandon Voelker, Daniel Neamati, Patrick Alan Johnson, Favyen Bastani, Jeff Rose, James Ryan Hennessy, Robert Guralnick, Douglas Soltis, Pamela Soltis, Shaowen Wang | 2026-03-10 | 💻 cs

Looking Back and Forth: Cross-Image Attention Calibration and Attentive Preference Learning for Multi-Image Hallucination Mitigation

This paper proposes CAPL, a framework that mitigates multi-image hallucinations in large vision-language models by introducing a selectable image token interaction mechanism for fine-grained cross-image alignment and a preference learning strategy that trains the model to rely on genuine visual evidence rather than textual priors.

Xiaochen Yang, Hao Fang, Jiawei Kong, Yaoxin Mao, Bin Chen, Shu-Tao Xia | 2026-03-10 | 💻 cs

Animating Petascale Time-varying Data on Commodity Hardware with LLM-assisted Scripting

This paper presents a user-friendly framework that enables domain scientists to generate 3D animations of petascale, time-varying climate data on commodity hardware using an LLM-assisted conversational interface, thereby eliminating the need for specialized visualization expertise and high-performance computing resources.

Ishrat Jahan Eliza, Xuan Huang, Aashish Panta, Alper Sahistan, Zhimin Li, Amy A. Gooch, Valerio Pascucci | 2026-03-10 | 💻 cs

Bi-directional digital twin prototype anchoring with multi-periodicity learning for few-shot fault diagnosis

This paper proposes a bi-directional digital twin prototype anchoring framework enhanced with multi-periodicity learning to achieve robust few-shot fault diagnosis by leveraging meta-training in a virtual simulation space and test-time adaptation in the physical domain, thereby overcoming the limitations of traditional methods that require abundant labeled or unlabeled target data.

Pengcheng Xia, Zhichao Dong, Yixiang Huang, Chengjin Qin, Qun Chao, Chengliang Liu | 2026-03-10 | 💻 cs
