cs papers | Gist.Science

How Long Can Unified Multimodal Models Generate Images Reliably? Taming Long-Horizon Interleaved Image Generation via Context Curation

This paper introduces UniLongGen, a training-free inference strategy that improves long-horizon interleaved image generation by dynamically curating context to discard accumulated visual noise, thereby overcoming the reliability collapse caused by dense visual token interference in unified multimodal models.

Haoyu Chen, Qing Liu, Yuqian Zhou, He Zhang, Zhaowen Wang, Mengwei Ren, Jingjing Ren, Xiang Wang, Zhe Lin, Lei Zhu | 2026-03-10 | 💻 cs

CONSTANT: Towards High-Quality One-Shot Handwriting Generation with Patch Contrastive Enhancement and Style-Aware Quantization

The paper introduces CONSTANT, a novel one-shot handwriting generation framework that leverages Style-Aware Quantization and a latent patch-based contrastive objective within a diffusion model to overcome existing limitations in capturing diverse writer styles and generating high-quality, realistic handwritten images across multiple languages.

Anh-Duy Le, Van-Linh Pham, Thanh-Nam Vo, Xuan Toan Mai, Tuan-Anh Tran | 2026-03-10 | 💻 cs

Evaluating Parkinson's Disease Detection in Anonymized Speech: A Performance and Acoustic Analysis

This paper evaluates the trade-off between privacy and Parkinson's disease detection in anonymized speech, demonstrating that while STT-TTS anonymization severely degrades diagnostic performance by erasing prosodic cues, kNN-VC effectively preserves macro-prosodic features to maintain high detection accuracy with only a minor performance drop.

Carlos Franzreb, Francisco Teixeira, Ben Luks, Sebastian Möller, Alberto Abad | 2026-03-10 | 💻 cs

Targeted Speaker Poisoning Framework in Zero-Shot Text-to-Speech

This paper introduces a novel Speech Generation Speaker Poisoning (SGSP) framework to address privacy risks in zero-shot text-to-speech by modifying trained models to prevent the generation of specific speaker identities while maintaining utility for others, demonstrating effective protection for up to 15 speakers but revealing scalability challenges with larger sets due to identity overlap.

Thanapat Trachu, Thanathai Lertpetchpun, Sai Praneeth Karimireddy, Shrikanth Narayanan | 2026-03-10 | 💻 cs

ReconDrive: Fast Feed-Forward 4D Gaussian Splatting for Autonomous Driving Scene Reconstruction

ReconDrive is a fast, feed-forward framework that adapts the VGGT foundation model with hybrid prediction heads and static-dynamic composition to achieve high-fidelity, scalable 4D Gaussian Splatting for autonomous driving scenes, outperforming existing feed-forward methods while matching the quality of slower optimization-based approaches.

Haibao Yu, Kuntao Xiao, Jiahang Wang, Ruiyang Hao, Yuxin Huang, Guoran Hu, Haifang Qin, Bowen Jing, Yuntian Bo, Ping Luo | 2026-03-10 | 💻 cs

A symmetric recursive algorithm for mean-payoff games

This paper introduces a new deterministic symmetric recursive algorithm designed to solve mean-payoff games.

Pierre Ohlmann | 2026-03-10 | 💻 cs

AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents

This paper introduces AgentRaft, an automated framework that combines program analysis and semantic reasoning to detect and quantify the systemic risk of Data Over-Exposure in LLM agents, demonstrating high accuracy and efficiency across thousands of real-world tools.

Yixi Lin (Sun Yat-sen University, Zhuhai, Guangdong, China), Jiangrong Wu (Sun Yat-sen University, Zhuhai, Guangdong, China), Yuhong Nan (Sun Yat-sen University, Zhuhai, Guangdong, China), Xueqiang Wang (University of Central Florida, Orlando, Florida, USA), Xinyuan Zhang (Sun Yat-sen University, Zhuhai, Guangdong, China), Zibin Zheng (Sun Yat-sen University, Zhuhai, Guangdong, China) | 2026-03-10 | 💻 cs

Active Inference for Micro-Gesture Recognition: EFE-Guided Temporal Sampling and Adaptive Learning

This paper proposes an active inference-based framework for micro-gesture recognition that utilizes Expected Free Energy-guided temporal sampling and uncertainty-aware adaptive learning to overcome challenges like low amplitude, noise, and inter-subject variability, demonstrating significant performance improvements on the SMG dataset.

Weijia Feng, Jingyu Yang, Ruojia Zhang, Fengtao Sun, Qian Gao, Chenyang Wang, Tongtong Su, Jia Guo, Xiaobai Li, Minglai Shao | 2026-03-10 | 💻 cs

Learning the APT Kill Chain: Temporal Reasoning over Provenance Data for Attack Stage Estimation

This paper introduces StageFinder, a temporal graph learning framework that fuses host and network provenance data with graph neural networks and LSTMs to accurately and stably estimate Advanced Persistent Threat (APT) attack stages, achieving a macro F1-score of 0.96 on DARPA datasets.

Trung V. Phan, Thomas Bauschert | 2026-03-10 | 💻 cs

PureCC: Pure Learning for Text-to-Image Concept Customization

PureCC is a novel concept customization framework that employs a decoupled learning objective and a dual-branch training pipeline to achieve high-fidelity text-to-image personalization while effectively preserving the original model's behavior and capabilities.

Zhichao Liao, Xiaole Xian, Qingyu Li, Wenyu Qin, Meng Wang, Weicheng Xie, Siyang Song, Pingfa Feng, Long Zeng, Liang Pan | 2026-03-10 | 💻 cs

Brain-WM: Brain Glioblastoma World Model

Brain-WM is a pioneering brain glioblastoma world model that utilizes a novel Y-shaped Mixture-of-Transformers architecture to unify next-step treatment prediction and future MRI generation, effectively capturing the co-evolutionary dynamics between tumor progression and treatment response to optimize clinical outcomes.

Chenhui Wang, Boyun Zheng, Liuxin Bao, Zhihao Peng, Peter Y. M. Woo, Hongming Shan, Yixuan Yuan | 2026-03-10 | 💻 cs

SiamGM: Siamese Geometry-Aware and Motion-Guided Network for Real-Time Satellite Video Object Tracking

The paper proposes SiamGM, a real-time Siamese network for satellite video object tracking that integrates a geometry-aware Inter-Frame Graph Attention module and a motion-guided optimization strategy to effectively address challenges like small targets and occlusions while achieving 130 FPS without computational overhead.

Zixiao Wen, Zhen Yang, Jiawei Li, Xiantai Xiang, Guangyao Zhou, Yuxin Hu, Yuhan Liu | 2026-03-10 | 💻 cs

Efficient RGB-D Scene Understanding via Multi-task Adaptive Learning and Cross-dimensional Feature Guidance

This paper proposes an efficient multi-task RGB-D scene understanding model that integrates an enhanced fusion encoder, specialized feature interaction layers, and a dynamic adaptive loss function to simultaneously perform semantic, instance, and panoptic segmentation, orientation estimation, and scene classification with improved accuracy and speed across multiple datasets.

Guodong Sun, Junjie Liu, Gaoyang Zhang, Bo Wu, Yang Zhang | 2026-03-10 | 💻 cs

Approximate Imitation Learning for Event-based Quadrotor Flight in Cluttered Environments

This paper proposes an Approximate Imitation Learning framework that enables a quadrotor to fly at high speeds through cluttered environments using only a single event camera by training an end-to-end neural network with a large offline dataset and lightweight state simulations, thereby avoiding the computational cost of rendering synthetic event data while achieving robust real-world performance.

Nico Messikommer, Jiaxu Xing, Leonard Bauersfeld, Marco Cannici, Elie Aljalbout, Davide Scaramuzza | 2026-03-10 | 💻 cs

FeasibleCap: Real-Time Embodiment Constraint Guidance for In-the-Wild Robot Demonstration Collection

FeasibleCap is a gripper-in-hand data collection system that provides real-time, hardware-free executability feedback through visual and haptic cues, enabling demonstrators to capture valid robot trajectories without headsets or learned dynamics models while improving replay success and preserving cross-embodiment transfer.

Zi Yin, Fanhong Li, Yun Gui, Jia Liu | 2026-03-10 | 💻 cs

Model-Based and Neural-Aided Approaches for Dog Dead Reckoning

This paper introduces three neural-aided and model-based algorithms for dog dead reckoning (DDR) that utilize only inertial sensors to achieve accurate positioning with less than 10% absolute distance error for both biological and robotic dogs, validated through the newly created DogMotion dataset and a robotic legged dog dataset.

Gal Versano, Itai Savin, Itzik Klein | 2026-03-10 | 💻 cs

AiRWeb: Using AR to Extend Web Browsing Beyond Handheld Screens

The paper introduces AiRWeb, a phone-based augmented reality system that allows users to seamlessly offload and organize arbitrary web content into their surrounding physical space to overcome mobile screen limitations, demonstrating its learnability and usability while highlighting design challenges in activation modes.

Mengfei Gao, Caroline Appert, Ludovic David, Emmanuel Pietriga | 2026-03-10 | 💻 cs

3DGS-HPC: Distractor-free 3D Gaussian Splatting with Hybrid Patch-wise Classification

This paper proposes 3DGS-HPC, a novel framework that improves 3D Gaussian Splatting in real-world environments by replacing fragile semantic cues with a robust patch-wise classification strategy and a hybrid metric to effectively identify and suppress transient distractors like moving objects and shadows.

Jiahao Chen, Yipeng Qin, Ganlong Zhao, Xin Li, Wenping Wang, Guanbin Li | 2026-03-10 | 💻 cs

On Factorization of Sparse Polynomials of Bounded Individual Degree

This paper presents deterministic polynomial-time algorithms and new structural bounds for factoring sparse polynomials with bounded individual degree, including recovering irreducible factors from blackbox access and improving upon previous runtime complexities for sparse polynomial factorization.

Aminadav Chuyoon, Amir Shpilka | 2026-03-10 | 💻 cs

Fast Attention-Based Simplification of LiDAR Point Clouds for Object Detection and Classification

This paper proposes an efficient, end-to-end learned point cloud simplification method that combines feature embedding with attention-based sampling to achieve a superior balance between computational speed and accuracy for LiDAR-based object detection and classification compared to traditional sampling techniques.

Z. Rozsa, Á. Madaras, Q. Wei, X. Lu, M. Golarits, H. Yuan, T. Sziranyi, R. Hamzaoui | 2026-03-10 | 💻 cs
