cs.AI papers | Gist.Science

Neural Dynamics-Informed Pre-trained Framework for Personalized Brain Functional Network Construction

This paper proposes a neural dynamics-informed pre-trained framework that overcomes the limitations of traditional atlas-based methods by extracting personalized neural activity representations to guide brain parcellation and correlation estimation, thereby achieving superior performance in constructing personalized brain functional networks across heterogeneous scenarios.

Hongjie Jiang, Yifei Tang, Shuqiang Wang | 2026-03-10 | 🤖 cs.LG

How Long Can Unified Multimodal Models Generate Images Reliably? Taming Long-Horizon Interleaved Image Generation via Context Curation

This paper introduces UniLongGen, a training-free inference strategy that improves long-horizon interleaved image generation by dynamically curating context to discard accumulated visual noise, thereby overcoming the reliability collapse caused by dense visual token interference in unified multimodal models.

Haoyu Chen, Qing Liu, Yuqian Zhou, He Zhang, Zhaowen Wang, Mengwei Ren, Jingjing Ren, Xiang Wang, Zhe Lin, Lei Zhu | 2026-03-10 | 💻 cs

DreamSAC: Learning Hamiltonian World Models via Symmetry Exploration

DreamSAC is a framework that enhances extrapolative generalization in physics simulations by combining an unsupervised symmetry exploration strategy, which actively probes conservation laws via a Hamiltonian-based curiosity bonus, with a Hamiltonian-based world model that learns invariant physical states from raw observations through a novel contrastive objective.

Jinzhou Tang, Fan Feng, Minghao Fu, Wenjun Lin, Biwei Huang, Keze Wang | 2026-03-10 | 🤖 cs.LG

COOL-MC: Verifying and Explaining RL Policies for Multi-bridge Network Maintenance

The paper introduces COOL-MC, a framework that verifies and explains reinforcement learning policies for multi-bridge network maintenance by applying probabilistic model checking and explainability methods to a PRISM-encoded Markov decision process, thereby revealing safety violation probabilities and systematic policy biases.

Dennis Gross | 2026-03-10 | 🤖 cs.LG

Learning-free L2-Accented Speech Generation using Phonological Rules

This paper proposes a learning-free text-to-speech framework that generates L2-accented speech by applying phonological rules to phoneme sequences within a multilingual TTS model, enabling explicit accent control without requiring large-scale accented training datasets.

Thanathai Lertpetchpun, Yoonjeong Lee, Jihwan Lee, Tiantian Feng, Dani Byrd, Shrikanth Narayanan | 2026-03-10 | 💬 cs.CL

Targeted Speaker Poisoning Framework in Zero-Shot Text-to-Speech

This paper introduces a novel Speech Generation Speaker Poisoning (SGSP) framework to address privacy risks in zero-shot text-to-speech by modifying trained models to prevent the generation of specific speaker identities while maintaining utility for others, demonstrating effective protection for up to 15 speakers but revealing scalability challenges with larger sets due to identity overlap.

Thanapat Trachu, Thanathai Lertpetchpun, Sai Praneeth Karimireddy, Shrikanth Narayanan | 2026-03-10 | 💻 cs

Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

This paper introduces Nwāchā Munā, the first manually transcribed Devanagari speech corpus for the endangered Nepal Bhasha, and demonstrates that proximal cross-lingual transfer from Nepali achieves competitive automatic speech recognition performance comparable to large multilingual models while being significantly more computationally efficient.

Rishikesh Kumar Sharma, Safal Narshing Shrestha, Jenny Poudel, Rupak Tiwari, Arju Shrestha, Rupak Raj Ghimire, Bal Krishna Bal | 2026-03-10 | 💬 cs.CL

GRD-Net: Generative-Reconstructive-Discriminative Anomaly Detection with Region of Interest Attention Module

The paper proposes GRD-Net, a novel architecture combining a generative adversarial network with a region-of-interest attention module to improve industrial surface anomaly detection and localization by learning from normal products and synthetic defects while focusing on relevant areas, thereby reducing reliance on biased post-processing algorithms.

Niccolò Ferrari, Michele Fraccaroli, Evelina Lamma | 2026-03-10 | 🤖 cs.LG

A Systematic Comparison of Training Objectives for Out-of-Distribution Detection in Image Classification

This paper systematically evaluates four training objectives—Cross-Entropy, Prototype, Triplet, and Average Precision Losses—for out-of-distribution detection in image classification, revealing that while they achieve comparable in-distribution accuracy, Cross-Entropy Loss delivers the most consistent performance across both near- and far-OOD scenarios under standardized protocols.

Furkan Genç, Onat Özdemir, Emre Akbas | 2026-03-10 | 🤖 cs.LG

Integration of deep generative Anomaly Detection algorithm in high-speed industrial line

This paper presents a semi-supervised deep generative anomaly detection framework, utilizing a residual autoencoder with a dense bottleneck, that achieves high-accuracy, real-time defect detection and localization on high-speed pharmaceutical Blow-Fill-Seal production lines while operating within strict 500 ms timing constraints.

Niccolò Ferrari, Nicola Zanarini, Michele Fraccaroli, Alice Bizzarri, Evelina Lamma | 2026-03-10 | 🤖 cs.LG

Shorter Thoughts, Same Answers: Difficulty-Scaled Segment-Wise RL for CoT Compression

The paper proposes Difficulty-Scaled Segment-Wise GRPO (DSS-GRPO), a reinforcement learning method that decomposes training signals into separate "think" and "answer" segments with difficulty-aware scaling to compress reasoning traces without compromising answer quality.

Ye Tian, Aijun Liu | 2026-03-10 | 🤖 cs.LG

SMAT: Staged Multi-Agent Training for Co-Adaptive Exoskeleton Control

The paper proposes Staged Multi-Agent Training (SMAT), a four-stage curriculum that progressively trains a human-exoskeleton system to achieve stable co-adaptation, resulting in a control policy that significantly reduces hip muscle activation and delivers consistent, positive mechanical power across diverse users without requiring subject-specific retraining.

Yifei Yuan, Ghaith Androwis, Xianlian Zhou | 2026-03-10 | 🤖 cs.LG

Evaluating Synthetic Data for Baggage Trolley Detection in Airport Logistics

This paper proposes a high-fidelity synthetic data generation pipeline using NVIDIA Omniverse to address data scarcity and privacy constraints in airport logistics, demonstrating that mixed training with synthetic data and only 40% of real annotations achieves performance comparable to full real-data baselines while reducing annotation effort by 25–35%.

Abdeldjalil Taibi, Mohmoud Badlis, Amina Bensalem, Belkacem Zouilekh, Mohammed Brahimi | 2026-03-10 | 🤖 cs.LG

AtomicVLA: Unlocking the Potential of Atomic Skill Learning in Robots

The paper proposes AtomicVLA, a unified planning-and-execution framework that utilizes a Skill-Guided Mixture-of-Experts architecture to dynamically compose atomic skill abstractions, thereby significantly improving scalability and performance in long-horizon robotic manipulation and continual learning tasks compared to existing monolithic VLA models.

Likui Zhang, Tao Tang, Zhihao Zhan, Xiuwei Chen, Zisheng Chen, Jianhua Han, Jiangtong Zhu, Pei Xu, Hang Xu, Hefeng Wu, Liang Lin, Xiaodan Liang | 2026-03-10 | 💻 cs

Ref-DGS: Reflective Dual Gaussian Splatting

Ref-DGS is an efficient, rasterization-based framework that achieves state-of-the-art novel view synthesis on reflective scenes by decoupling surface geometry from specular reflections using a dual Gaussian representation and a lightweight adaptive mixing shader, thereby avoiding the high computational cost of explicit ray tracing.

Ningjing Fan, Yiqun Wang, Dongming Yan, Peter Wonka | 2026-03-10 | 💻 cs

AI-Driven Phase Identification from X-ray Hyperspectral Imaging of cycled Na-ion Cathode Materials

This paper presents an AI-driven workflow combining a Gaussian mixture variational autoencoder with Pearson correlation coefficients to analyze sparsely sampled X-ray hyperspectral data, enabling the generation of nanometer-resolution multiphase maps that reveal complex phase heterogeneity and transition zones in individual Na-ion cathode particles during electrochemical cycling.

Fayçal Adrar, Nicolas Folastre, Chloé Pablos, Stefan Stanescu, Sufal Swaraj, Raghvender Raghvender, François Cadiou, Laurence Croguennec, Matthieu Bugnet, Arnaud Demortière | 2026-03-10 | 🔬 cond-mat.mtrl-sci

Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers

This survey provides a comprehensive overview of memory mechanisms in autonomous LLM agents from 2022 to early 2026, formalizing a write–manage–read framework, introducing a three-dimensional taxonomy, analyzing key mechanisms and evaluation benchmarks, and outlining critical applications and future challenges.

Pengfei Du | 2026-03-10 | 💻 cs

Compressed-Domain-Aware Online Video Super-Resolution

This paper proposes CDA-VSR, a compressed-domain-aware online video super-resolution network that leverages motion vectors, residual maps, and frame types to achieve real-time, high-quality reconstruction with significantly reduced computational cost compared to state-of-the-art methods.

Yuhang Wang, Hai Li, Shujuan Hou, Zhetao Dong, Xiaoyao Yang | 2026-03-10 | 💻 cs

TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward

TDM-R1 introduces a novel reinforcement learning paradigm that enables few-step diffusion models to effectively incorporate non-differentiable rewards by decoupling surrogate reward learning from generator training, achieving state-of-the-art performance across various metrics and scaling to powerful models like Z-Image with only 4 inference steps.

Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang | 2026-03-10 | 💻 cs

VoiceSHIELD-Small: Real-Time Malicious Speech Detection and Transcription

VoiceSHIELD-Small is a lightweight, real-time model built on Whisper-small that simultaneously transcribes speech and detects malicious content with 99.16% accuracy, offering a faster and more secure alternative to traditional text-based filtering for voice AI systems.

Sumit Ranjan, Sugandha Sharma, Ubaid Abbas, Puneeth N Ail | 2026-03-10 | 💻 cs
