cs.AI papers | Gist.Science

CoRPO: Adding a Correctness Bias to GRPO Improves Generalization

The paper proposes Correctness-Relative Policy Optimization (CoRPO), a modification to Group-Relative Policy Optimization (GRPO) that clips the advantage baseline to a correctness threshold to prevent reinforcing incorrect solutions, thereby significantly improving the model's generalization and cross-domain reasoning capabilities.

Anisha Garg, Claire Zhang, Nishit Neema + 3 more | 2026-03-06 | 💻 cs
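
As a rough illustration of the CoRPO summary above, the sketch below compares a plain group-mean baseline with one clipped to a correctness threshold. It assumes the clip is a lower bound (the baseline never drops below the threshold) and omits the standard-deviation normalization that GRPO implementations often apply. The function names and the `correctness_threshold` parameter are hypothetical and not taken from the paper.

```python
from statistics import mean


def grpo_advantages(rewards):
    """Group-relative advantage: each reward minus the group mean."""
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def corpo_advantages(rewards, correctness_threshold):
    """Same as above, but the baseline is never below the threshold.

    Assumption: "clipping the baseline" means taking the larger of the
    group mean and a fixed correctness threshold.
    """
    baseline = max(mean(rewards), correctness_threshold)
    return [r - baseline for r in rewards]


# A group in which every sampled solution scores below the threshold.
all_wrong = [0.1, 0.2, 0.3, 0.0]
plain = grpo_advantages(all_wrong)
clipped = corpo_advantages(all_wrong, correctness_threshold=0.5)
```

In this toy group the plain baseline gives the best of the failed samples a positive advantage, while the clipped baseline keeps every advantage negative. When the group mean already exceeds the threshold, the two functions return the same values.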

SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition

This paper proposes SASG-DA, a novel diffusion-based data augmentation framework that leverages semantic guidance and sparse-aware sampling to generate faithful and diverse sEMG data, thereby significantly improving the generalization and performance of myoelectric gesture recognition models on benchmark datasets.

Chen Liu, Can Han, Weishi Xu + 2 more | 2026-03-06 | 💻 cs

DAP: A Discrete-token Autoregressive Planner for Autonomous Driving

DAP is a compact, discrete-token autoregressive planner that jointly forecasts BEV semantics and ego trajectories with reinforcement learning fine-tuning, achieving state-of-the-art performance on autonomous driving benchmarks despite a limited 160M parameter budget.

Bowen Ye, Bin Zhang, Hang Zhao | 2026-03-06 | 💻 cs

CCSD: Cross-Modal Compositional Self-Distillation for Robust Brain Tumor Segmentation with Missing Modalities

This paper proposes Cross-Modal Compositional Self-Distillation (CCSD), a novel framework utilizing a shared-specific architecture and dual self-distillation strategies to achieve robust, state-of-the-art brain tumor segmentation performance across arbitrary missing MRI modality scenarios.

Dongqing Xie, Yonghuang Wu, Zisheng Ai + 4 more | 2026-03-06 | 💻 cs

Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

This paper introduces FlashCache, a frequency-domain-guided KV cache compression framework that identifies and preserves critical "Outlier KVs" while leveraging low-pass filtering and dynamic budget allocation to achieve significant inference speedups and memory reduction in multimodal large language models without compromising performance.

Yaoxin Yang, Peng Ye, Xudong Tan + 4 more | 2026-03-06 | 💻 cs

MambaTAD: When State-Space Models Meet Long-Range Temporal Action Detection

This paper presents MambaTAD, an end-to-end one-stage temporal action detection model that leverages a Diagonal-Masked Bidirectional State-Space module and a global feature fusion head to overcome the limitations of existing state-space models and traditional methods in detecting long-span action instances with linear computational complexity.

Hui Lu, Yi Yu, Shijian Lu + 4 more | 2026-03-06 | 💻 cs

CycleChemist: A Dual-Pronged Machine Learning Framework for Organic Photovoltaic Discovery

This paper introduces CycleChemist, a dual-pronged machine learning framework that leverages a new large-scale dataset (OPV2D) to simultaneously predict organic photovoltaic performance and generate synthetically accessible high-efficiency donor-acceptor materials through an integrated predictive and generative approach.

Hou Hei Lam, Jiangjie Qiu, Xiuyuan Hu + 5 more | 2026-03-06 | 🔬 cond-mat.mtrl-sci

Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning

This paper introduces L4L, a solver-centric framework that enhances the trustworthiness of legal AI by integrating role-differentiated LLM agents with SMT-backed formal verification to ensure logical, auditable, and statute-aligned legal reasoning.

Linze Chen, Yufan Cai, Zhe Hou + 1 more | 2026-03-06 | 💻 cs

Steering Awareness: Models Can Be Trained to Detect Activation Steering

This paper demonstrates that language models can be fine-tuned to reliably detect and identify activation steering interventions, revealing that such steering is not inherently undetectable and that models trained to recognize it may paradoxically become more susceptible to behavioral manipulation.

Joshua Fonseca Rivera, David Demitri Africa | 2026-03-06 | 💻 cs

DPAC: Distribution-Preserving Adversarial Control for Diffusion Sampling

This paper introduces DPAC, a diffusion guidance method that projects adversarial gradients onto the tangent space of iso-density surfaces to minimize path-space KL divergence and control energy, thereby theoretically and empirically achieving higher sample quality (lower FID) while maintaining target classification success.

Han-Jin Lee, Han-Ju Lee, Jin-Seong Kim + 1 more | 2026-03-06 | 💻 cs
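
The projection step named in the DPAC summary above can be sketched with basic vector algebra. An iso-density surface is a level set of the density, so its normal direction is the gradient of the log-density (the score), and projecting onto the tangent space means removing the gradient component along that normal. The function and argument names are hypothetical, and this is only the generic orthogonal projection, not the paper's full guidance method.

```python
def project_onto_tangent(grad, score):
    """Remove from `grad` its component along `score`.

    `grad` stands in for an adversarial gradient and `score` for the
    gradient of the log-density at the same point (both assumptions).
    The result is orthogonal to `score`, so it lies in the tangent
    space of the iso-density surface through that point.
    """
    norm_sq = sum(s * s for s in score)
    if norm_sq == 0.0:
        # No well-defined normal direction; leave the gradient unchanged.
        return list(grad)
    coeff = sum(g * s for g, s in zip(grad, score)) / norm_sq
    return [g - coeff * s for g, s in zip(grad, score)]


projected = project_onto_tangent([1.0, 2.0], [0.0, 3.0])
```

Moving along the projected direction leaves the density unchanged to first order, which is the intuition behind steering a sample without pushing it off the data distribution.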

Deep FlexQP: Accelerated Nonlinear Programming via Deep Unfolding

The paper proposes Deep FlexQP, a deep unfolding-based solver that accelerates nonlinear programming by learning dimension-agnostic parameters for a robust, always-feasible convex QP relaxation, thereby significantly improving the speed and success rates of SQP and safety filter applications while providing rigorous performance guarantees.

Alex Oshin, Rahul Vodeb Ghosh, Augustinos D. Saravanos + 1 more | 2026-03-06 | 🔢 math

Guided Flow Policy: Learning from High-Value Actions in Offline Reinforcement Learning

The paper introduces Guided Flow Policy (GFP), a novel offline reinforcement learning method that couples a multi-step flow-matching policy with a distilled one-step actor to selectively focus on high-value actions, achieving state-of-the-art performance across diverse benchmarks by overcoming the limitations of indiscriminate behavior regularization.

Franki Nguimatsia Tiofack, Théotime Le Hellard, Fabian Schramm + 2 more | 2026-03-06 | 💻 cs

Bootstrapped Mixed Rewards for RL Post-Training: Injecting Canonical Action Order

This paper demonstrates that injecting a canonical action ordering signal into the reward function during RL post-training significantly improves Transformer performance on Zebra puzzles compared to optimizing for task success alone, even when the model is fine-tuned on randomized solution sequences.

Prakhar Gupta, Vaibhav Gupta | 2026-03-06 | 💻 cs

Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention

This paper proposes a multi-loss learning framework for speech emotion recognition that integrates energy-adaptive mixup and frame-level attention to address data scarcity and emotional complexity, achieving state-of-the-art performance across four benchmark datasets.

Cong Wang, Yizhong Geng, Yuhua Wen + 7 more | 2026-03-06 | 💻 cs

Sparse Attention Post-Training for Mechanistic Interpretability

This paper introduces a post-training method that induces extreme sparsity in transformer attention (reducing connectivity to ~0.4%) without sacrificing performance, thereby revealing simplified, interpretable task-specific circuits and unifying feature-based and circuit-based perspectives on model behavior.

Florent Draye, Anson Lei, Hsiao-Ru Pan + 2 more | 2026-03-06 | 💻 cs

ClinNoteAgents: An LLM Multi-Agent System for Predicting and Interpreting Heart Failure 30-Day Readmission from Clinical Notes

ClinNoteAgents is a novel LLM-based multi-agent system that effectively predicts and interprets 30-day heart failure readmission risks by transforming unstructured clinical notes into structured risk factors and clinician-style abstractions, offering a scalable and interpretable solution for data-limited healthcare settings.

Rongjia Zhou, Chengzhuo Li, Carl Yang + 1 more | 2026-03-06 | 💻 cs

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

The paper introduces InternGeometry, an LLM agent enhanced by Complexity-Boosting Reinforcement Learning and a dynamic memory mechanism that iteratively proposes and verifies auxiliary constructions, achieving medalist-level performance on IMO geometry problems with significantly less training data than previous expert models.

Haiteng Zhao, Junhao Shen, Yiming Zhang + 7 more | 2026-03-06 | 💻 cs

ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

ReFusion introduces a novel masked diffusion model that integrates sequence reorganization with a hybrid parallel-autoregressive decoding strategy to simultaneously achieve full KV cache efficiency, reduce learning complexity, and significantly outperform existing diffusion models while narrowing the performance gap with autoregressive models.

Jia-Nan Li, Jian Guan, Wei Wu + 1 more | 2026-03-06 | 💻 cs

HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental Scale Streamflow Quality Control

HydroGEM is a self-supervised, zero-shot hybrid TCN-Transformer foundation model that effectively performs continental-scale streamflow quality control by detecting and reconstructing sensor anomalies with high accuracy and cross-national generalization, thereby addressing the scalability limitations of manual hydrological data validation.

Ijaz Ul Haq, Byung Suk Lee, Julia N. Perdrial + 1 more | 2026-03-06 | 💻 cs

RePo: Language Models with Context Re-Positioning

This paper introduces RePo, a novel mechanism that leverages a differentiable module to dynamically re-position tokens based on contextual dependencies rather than fixed linear indices, thereby reducing extraneous cognitive load and enhancing LLM performance on tasks involving noisy contexts, structured data, and long-range dependencies.

Huayang Li, Tianyu Zhao, Deng Cai + 1 more | 2026-03-06 | 💻 cs
