cs papers | Gist.Science

Re-evaluating Position and Velocity Decoding for Hand Pose Estimation with Surface Electromyography

This paper revises the prevailing conclusion that velocity decoding outperforms position decoding for sEMG-based hand pose estimation by demonstrating that, with a stable training recipe and a causal speed-adaptive filter, position decoding achieves superior tracking accuracy and a better smoothness-accuracy tradeoff across generalization conditions.

Nima Hadidi, Johannes Lee, Ebrahim Feghhi, Michael Yuan, Jonathan C. Kao | 2026-03-10 | 💻 cs

A Comparative Study of Recent Advances in Internet of Intrusion Detection Things

This paper presents a comprehensive comparative study of recent advances in Internet of Things (IoT) intrusion detection systems, analyzing their architectures, classifications, and evaluation methodologies to address critical security challenges.

Marianna Rezk (IRIMAS), Hassan Harb (IRIMAS), Ismail Bennis (IRIMAS), Sebastien Bindel (IRIMAS), Hafid Abouaissa (IRIMAS) | 2026-03-10 | 💻 cs

SplitAgent: A Privacy-Preserving Distributed Architecture for Enterprise-Cloud Agent Collaboration

SplitAgent introduces a novel distributed architecture that enables privacy-preserving collaboration between enterprise and cloud AI agents by utilizing context-aware dynamic sanitization, differential privacy, and zero-knowledge verification to achieve high task accuracy while significantly reducing data leakage compared to static approaches.

Jianshu She | 2026-03-10 | 💻 cs

SAVE: Speech-Aware Video Representation Learning for Video-Text Retrieval

The paper proposes SAVE, a speech-aware video representation learning method that enhances video-text retrieval by introducing a dedicated speech branch and soft-ALBEF for early vision-audio alignment, achieving state-of-the-art performance across five benchmarks.

Ruixiang Zhao, Zhihao Xu, Bangxiang Lan, Zijie Xin, Jingyu Liu, Xirong Li | 2026-03-10 | 💻 cs

Practical Type Inference: High-Throughput Recovery of Real-World Structures and Function Signatures

The paper introduces XTRIDE, a highly optimized n-gram-based approach that achieves state-of-the-art accuracy in recovering real-world structure layouts and function signatures from stripped binaries while running 70 to 2300 times faster than existing methods, making it suitable for automated reverse engineering pipelines.

Lukas Seidel, Sam Thomas, Konrad Rieck | 2026-03-10 | 💻 cs

SRNeRV: A Scale-wise Recursive Framework for Neural Video Representation

SRNeRV is a novel, parameter-efficient neural video representation framework that leverages scale self-similarity through a hybrid recursive sharing scheme to significantly reduce model size while achieving superior rate-distortion performance compared to traditional stacked multi-scale architectures.

Jia Wang, Jun Zhu, Xinfeng Zhang | 2026-03-10 | 💻 cs

GarmentPainter: Efficient 3D Garment Texture Synthesis with Character-Guided Diffusion Model

GarmentPainter is an efficient framework that synthesizes high-fidelity, 3D-consistent garment textures in UV space by leveraging UV position maps for structural guidance and a type selection module for character-based control, all integrated into a standard diffusion model without architectural modifications.

Jinbo Wu, Xiaobo Gao, Xing Liu, Chen Zhao, Jialun Liu | 2026-03-10 | 💻 cs

Disentangling Reasoning in Large Audio-Language Models for Ambiguous Emotion Prediction

This paper introduces a systematic framework for Large Audio-Language Models that reformulates ambiguous emotion recognition as a distributional reasoning problem, utilizing an ambiguity-aware objective and structured chain-of-thought supervision to significantly improve performance on standard benchmarks.

Xiaofeng Yu, Jiaheng Dong, Jean Honorio, Abhirup Ghosh, Hong Jia, Ting Dang | 2026-03-10 | 💻 cs

A General Lie-Group Framework for Continuum Soft Robot Modeling

This paper presents a unified Lie-group framework based on Cosserat rod theory and SE(3) cumulative parameterization that overcomes existing modeling limitations to provide efficient, constraint-free analytical expressions for the kinematics, statics, and dynamics of diverse continuum soft robotic structures.

Lingxiao Xun, Benoît Rosa, Jérôme Szewczyk, Brahim Tamadazte | 2026-03-10 | 💻 cs

Exploring Deep Learning and Ultra-Widefield Imaging for Diabetic Retinopathy and Macular Edema

This study uses the MICCAI 2024 UWF4DR dataset to benchmark state-of-the-art deep learning models (CNNs, Vision Transformers, and foundation models) in both the spatial and frequency domains on three ultra-widefield imaging tasks: image quality assessment, referable diabetic retinopathy detection, and diabetic macular edema identification. It finds that feature-level fusion and frequency-domain representations yield robust and explainable results.

Pablo Jimenez-Lizcano, Sergio Romero-Tapiador, Ruben Tolosana, Aythami Morales, Guillermo González de Rivera, Ruben Vera-Rodriguez, Julian Fierrez | 2026-03-10 | 💻 cs

Why Learn What Physics Already Knows? Realizing Agile mmWave-based Human Pose Estimation via Physics-Guided Preprocessing

This paper proposes a physics-guided preprocessing framework for millimeter-wave human pose estimation that explicitly models signal correlations and kinematics to achieve real-time, lightweight performance with significantly fewer parameters than existing data-driven baselines while maintaining competitive accuracy.

Shuntian Zheng, Jiaqi Li, Minzhe Ni, Xiaoman Lu, Yu Guan | 2026-03-10 | 💻 cs

SiMO: Single-Modality-Operable Multimodal Collaborative Perception

SiMO introduces a novel collaborative perception framework that utilizes Length-Adaptive Multi-Modal Fusion (LAMMA) and a "Pretrain-Align-Fuse-RD" training strategy to overcome sensor failures and semantic mismatches, ensuring robust performance across all individual modalities while maintaining effective multimodal integration.

Jiageng Wen, Shengjie Zhao, Bing Li, Jiafeng Huang, Kenan Ye, Hao Deng | 2026-03-10 | 💻 cs

Topologically Stable Hough Transform

This paper proposes a topologically stable variant of the Hough transform for line detection in point clouds, which replaces the traditional discretized voting scheme with a continuous score function and utilizes persistent homology to identify candidate lines via an efficient algorithm.

Stefan Huber, Kristóf Huszár, Michael Kerber, Martin Uray | 2026-03-10 | 💻 cs

Coupling Europe's Capacity Markets

This paper proposes a novel flow-based conceptual design for coupling European capacity markets that, by harnessing cross-border capacity while respecting network constraints, reduces system costs and improves investment efficiency compared to isolated national mechanisms.

Kamal Adekola, Laurens de Vries, Kenneth Bruninx | 2026-03-10 | 💻 cs

DynamicVGGT: Learning Dynamic Point Maps for 4D Scene Reconstruction in Autonomous Driving

This paper introduces DynamicVGGT, a unified feed-forward framework that extends static 3D perception to dynamic 4D scene reconstruction for autonomous driving by jointly predicting current and future point maps, utilizing a Motion-aware Temporal Attention module for temporal coherence, and employing a Dynamic 3D Gaussian Splatting Head to explicitly model point motion and refine geometry.

Zhuolin He, Jing Li, Guanghao Li, Xiaolei Chen, Jiacheng Tang, Siyang Zhang, Zhounan Jin, Feipeng Cai, Bin Li, Jian Pu, Jia Cai, Xiangyang Xue | 2026-03-10 | 💻 cs

WaDi: Weight Direction-aware Distillation for One-step Image Synthesis

The paper proposes WaDi, a novel one-step image synthesis framework that leverages the insight that weight direction changes are more critical than norm changes during distillation, introducing the parameter-efficient LoRaD adapter to achieve state-of-the-art performance with only 10% of trainable parameters.

Lei Wang, Yang Cheng, Senmao Li, Ge Wu, Yaxing Wang, Jian Yang | 2026-03-10 | 💻 cs

Seed2Scale: A Self-Evolving Data Engine for Embodied AI via Small to Large Model Synergy and Multimodal Evaluation

Seed2Scale is a self-evolving data engine that overcomes data bottlenecks in embodied AI by synergizing a lightweight "SuperTiny" model for robust data collection with a large Vision-Language Model for autonomous quality verification, enabling a target model to achieve a 131.2% performance improvement starting from just four seed demonstrations.

Cong Tai, Zhaoyu Zheng, Haixu Long, Hansheng Wu, Zhengbin Long, Haodong Xiang, Rong Shi, Zhuo Cui, Shizhuang Zhang, Gang Qiu, He Wang, Ruifeng Li, Biao Liu, Zhenzhe Sun, Tao Shen | 2026-03-10 | 💻 cs

FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use

The paper introduces FinToolBench, the first real-world, runnable benchmark that evaluates LLM agents on 760 executable financial tools using a novel framework assessing timeliness, intent, and regulatory compliance, alongside a proposed finance-aware baseline named FATR to advance trustworthy agentic AI in finance.

Jiaxuan Lu, Kong Wang, Yemin Wang, Qingmei Tang, Hongwei Zeng, Xiang Chen, Jiahao Pi, Shujian Deng, Lingzhi Chen, Yi Fu, Kehua Yang, Xiao Sun | 2026-03-10 | 💻 cs

Event-based Motion & Appearance Fusion for 6D Object Pose Tracking

This paper proposes a learning-free method for 6D object pose tracking that fuses event-based optical flow for high-speed pose propagation with a template-based correction strategy, demonstrating superior performance over state-of-the-art algorithms in highly dynamic scenarios where traditional RGB-D cameras struggle with motion blur and low frame rates.

Zhichao Li, Chiara Bartolozzi, Lorenzo Natale, Arren Glover | 2026-03-10 | 💻 cs

SAIL: Test-Time Scaling for In-Context Imitation Learning with VLM

SAIL is a test-time scaling framework that enhances one-shot robot imitation learning by reframing trajectory generation as an iterative refinement process guided by Monte Carlo Tree Search, an automated retrieval archive, and a vision-language model-based scoring mechanism, thereby significantly improving success rates across diverse manipulation tasks.

Makoto Sato, Yusuke Iwasawa, Yujin Tang, So Kuroki | 2026-03-10 | 💻 cs
