cs papers | Gist.Science

ReViP: Mitigating False Completion in Vision-Language-Action Models with Vision-Proprioception Rebalance

This paper introduces ReViP, a novel Vision-Language-Action framework that mitigates "false completion" failures caused by proprioceptive bias through vision-proprioception rebalancing and a new benchmark suite, achieving significant performance gains over existing models.

Zhuohao Li, Yinghao Li, Jian-Jian Jiang, Lang Zhou, Tianyu Zhang, Jiadong Yin, Mu Lin, Yi-Kin Wei, Wei-Shi Zheng | 2026-03-10 | cs

ScenePilot-Bench: A Large-Scale Dataset and Benchmark for Evaluation of Vision-Language Models in Autonomous Driving

This paper introduces ScenePilot-Bench, a large-scale benchmark built on the diverse ScenePilot-4K dataset to comprehensively evaluate and advance vision-language models in autonomous driving through multi-granularity annotations and a safety-aware, four-axis assessment framework.

Yujin Wang, Yutong Zheng, Wenxian Fan, Tianyi Wang, Hongqing Chu, Li Zhang, Bingzhao Gao, Daxin Tian, Jianqiang Wang, Hong Chen | 2026-03-10 | cs

Query-Guided Spatial-Temporal-Frequency Interaction for Music Audio-Visual Question Answering

This paper proposes QSTar, a novel query-guided spatial-temporal-frequency interaction method enhanced by a Query Context Reasoning block, which significantly improves Audio-Visual Question Answering performance by deeply integrating question-guided clues and audio frequency characteristics with visual perception, outperforming existing multimodal approaches on multiple benchmarks.

Kun Li, Michael Ying Yang, Sami Sebastian Brandt | 2026-03-10 | cs

Dynamic framework for edge-connectivity maintenance of simple graphs

This paper presents a dynamic framework for maintaining $k$-edge-connectivity in undirected simple graphs under edge insertions and deletions by combining Nagamochi-Ibaraki sparse certificates with Link-Cut Trees for efficient $O(k \log n)$ amortized insertions and a maximum-flow-based approach for $O(k^{3/2} n^{3/2})$ deletions, all while keeping the graph sparse with $O(kn)$ edges.

Blazej Wrobel | 2026-03-10 | cs

BioAgent Bench: An AI Agent Evaluation Suite for Bioinformatics

This paper introduces BioAgent Bench, a comprehensive evaluation suite and dataset for assessing AI agents in bioinformatics, which reveals that while frontier models can reliably construct multi-step pipelines, they lack robustness against perturbations and may be unsuitable for privacy-sensitive applications compared to open-weight alternatives.

Dionizije Fa, Marko Čuljak, Bruno Pandža, Mateo Čupic | 2026-03-10 | cs

Real-Time Aligned Reward Model beyond Semantics

This paper introduces R2M, a novel lightweight RLHF framework that mitigates reward overoptimization by leveraging real-time policy hidden states to dynamically align the reward model with the policy's evolving distribution, rather than relying solely on static semantic representations.

Zixuan Huang, Xin Xia, Yuxi Ren, Jianbin Zheng, Xuefeng Xiao, Hongyan Xie, Li Huaqiu, Songshi Liang, Zhongxiang Dai, Fuzhen Zhuang, Jianxin Li, Yikun Ban, Deqing Wang | 2026-03-10 | cs

Impact of LLMs news Sentiment Analysis on Stock Price Movement Prediction

This paper evaluates the impact of LLM-based news sentiment analysis on stock price prediction, demonstrating that DeBERTa outperforms other models and that an ensemble approach achieves 80% accuracy, while sentiment features provide modest improvements to various time-series forecasting architectures.

Walid Siala (SnT, University of Luxembourg, Luxembourg), Ahmed Khanfir (RIADI, ENSI, University of Manouba, Tunisia, SnT, University of Luxembourg, Luxembourg), Mike Papadakis (SnT, University of Luxembourg, Luxembourg) | 2026-03-10 | cs

From Performers to Creators: Understanding Retired Women's Perceptions of Technology-Enhanced Dance Performance

Through co-design workshops with retired Chinese women, this paper demonstrates that age-sensitive interactive dance technologies and AI-generated content can lower technical barriers and transform these dancers from passive performers into empowered co-creators of their stage performances.

Danlin Zheng, Xiaoying Wei, Chao Liu, Quanyu Zhang, Jingling Zhang, Shihui Guo, Mingming Fan | 2026-03-10 | cs

Cognitive-Flexible Control via Latent Model Reorganization with Predictive Safety Guarantees

This paper proposes a cognitive-flexible control framework that integrates an adaptive Deep Stochastic State-Space Model with Bayesian Model Predictive Control to ensure safety guarantees and rapid performance recovery in nonstationary cyber-physical systems through online latent representation reorganization.

Thanana Nuchkrua, Sudchai Boonto | 2026-03-10 | cs

Green-VLA: Staged Vision-Language-Action Model for Generalist Robots

The paper introduces Green-VLA, a five-stage curriculum framework that combines large-scale multimodal pretraining, embodiment-specific adaptation, and reinforcement learning to enable a single generalist policy to robustly control diverse robotic systems, including the Green humanoid, with enhanced safety and long-horizon efficiency.

I. Apanasevich, M. Artemyev, R. Babakyan, P. Fedotova, D. Grankin, E. Kupryashin, A. Misailidi, D. Nerus, A. Nutalapati, G. Sidorov, I. Efremov, M. Gerasyov, D. Pikurov, Y. Senchenko, S. Davidenko, D. Kulikov, M. Sultankin, K. Askarbek, O. Shamanin, D. Statovoy, E. Zalyaev, I. Zorin, A. Letkin, E. Rusakov, A. Silchenko, V. Vorobyov, S. Sobolnikov, A. Postnikov | 2026-03-10 | cs

Vulnerability-Amplifying Interaction Loops: a systematic failure mode in AI chatbot mental-health interactions

This paper introduces SIM-VAIL, a scalable auditing framework that reveals how consumer AI chatbots can systematically amplify mental health vulnerabilities through cumulative, context-dependent interaction loops, highlighting the need for multidimensional safety evaluations across diverse user phenotypes.

Veith Weilnhammer, Kevin YC Hou, Lennart Luettgau, Christopher Summerfield, Raymond Dolan, Matthew M Nour | 2026-03-10 | cs

AgenticLab: A Real-World Robot Agent Platform that Can See, Think, and Act

This paper introduces AgenticLab, a real-world, model-agnostic robot agent platform and benchmark that utilizes a closed-loop pipeline to evaluate state-of-the-art vision-language models in unstructured environments, revealing critical failure modes in long-horizon manipulation that static evaluations miss.

Pengyuan Guo, Zhonghao Mai, Zhengtong Xu, Kaidi Zhang, Heng Zhang, Zichen Miao, Arash Ajoudani, Zachary Kingston, Qiang Qiu, Yu She | 2026-03-10 | cs

Six Times to Spare: Characterizing GPU-Accelerated 5G LDPC Decoding for Edge-RSU Communications

This paper demonstrates that offloading 5G LDPC decoding to GPUs on compact edge platforms significantly improves throughput and reduces latency, thereby providing the necessary compute headroom to meet strict timing constraints for ultra-reliable low-latency vehicular communications.

Ryan Barker, Julia Boone, Tolunay Seyfi, Alireza Ebrahimi Dorcheh, Fatemeh Afghah, Joseph Boccuzzi | 2026-03-10 | cs

Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

This paper introduces FSTab, a framework that demonstrates how LLM-generated software exhibits predictable, recurring vulnerabilities by enabling black-box attacks based on frontend features and quantifying the consistency of these flaws across different domains and model variations.

Tomer Kordonsky, Maayan Yamin, Noam Benzimra, Amit LeVi, Avi Mendelson | 2026-03-10 | cs

LMMRec: LLM-driven Motivation-aware Multimodal Recommendation

This paper introduces LMMRec, a model-agnostic framework that leverages large language models and chain-of-thought prompting to extract fine-grained user and item motivations from heterogeneous text data, effectively aligning them with interaction signals to significantly improve multimodal recommendation performance.

Yicheng Di, Zhanjie Zhang, Yun Wang, Jinren Liu, Jiaqi Yan, Jiyu Wei, Xiangyu Chen, Yuan Liu | 2026-03-10 | cs

Assessing Problem-Solving in HR Contexts: A Comparison Between Game-Based and Self-Report Measures

This study finds no significant convergence between self-reported and game-based behavioral measures of problem-solving, suggesting that these distinct modalities provide complementary rather than redundant information for personnel selection.

Fabrizio Fornari, Eleonora Cova, Niccolò Vito Vacca, Francesco Bocci, Marcello Sarini, Luigi Caputo | 2026-03-10 | cs

Conditional Diffusion Guidance under Hard Constraint: A Stochastic Analysis Approach

This paper proposes a principled conditional diffusion guidance framework based on Doob's h-transform that enforces hard constraints without modifying pretrained score networks, introducing novel off-policy learning algorithms to estimate the necessary guidance terms and providing non-asymptotic convergence guarantees for the resulting sampler.

Zhengyi Guo, Wenpin Tang, Renyuan Xu | 2026-03-10 | cs

Beyond Judgment: Exploring Large Language Models as Non-Judgmental Support for Maternal Mental Health

This mixed-methods study of 107 mothers reveals that while Large Language Models serve as valuable non-judgmental resources for emotional support and reassurance regarding childcare decisions, most users still prioritize human warmth, highlighting the technology's role as a low-risk supplement rather than a replacement for human connection.

Shayla Sharmin, Sadia Afrin Ratna | 2026-03-10 | cs

NAAMSE: Framework for Evolutionary Security Evaluation of Agents

This paper introduces NAAMSE, an evolutionary framework that automates and enhances AI agent security evaluation by using a single autonomous agent to iteratively mutate prompts and explore corpora, thereby uncovering adaptive vulnerabilities missed by static benchmarks while ensuring the models maintain benign-use correctness.

Kunal Pai, Parth Shah, Harshil Patel | 2026-03-10 | cs

PhysDrape: Learning Explicit Forces and Collision Constraints for Physically Realistic Garment Draping

PhysDrape is a hybrid neural-physical solver that combines a Physics-Informed Graph Neural Network with a differentiable two-stage force and collision projection system to achieve physically realistic garment draping with negligible interpenetration and superior fidelity compared to existing deep learning methods.

Minghai Chen, Mingyuan Liu, Ning Ma, Jianqing Li, Yuxiang Huan | 2026-03-10 | cs
