cs papers | Gist.Science

A General Lie-Group Framework for Continuum Soft Robot Modeling

This paper presents a unified Lie-group framework based on Cosserat rod theory and SE(3) cumulative parameterization that overcomes existing modeling limitations to provide efficient, constraint-free analytical expressions for the kinematics, statics, and dynamics of diverse continuum soft robotic structures.

Lingxiao Xun, Benoît Rosa, Jérôme Szewczyk, Brahim Tamadazte | 2026-03-10 | 💻 cs

Exploring Deep Learning and Ultra-Widefield Imaging for Diabetic Retinopathy and Macular Edema

This study leverages the MICCAI 2024 UWF4DR dataset to benchmark state-of-the-art deep learning models, including CNNs, Vision Transformers, and foundation models, in both spatial and frequency domains for image quality assessment, referable diabetic retinopathy detection, and diabetic macular edema identification using ultra-widefield imaging, demonstrating that feature-level fusion and frequency-domain representations yield robust and explainable results.

Pablo Jimenez-Lizcano, Sergio Romero-Tapiador, Ruben Tolosana, Aythami Morales, Guillermo González de Rivera, Ruben Vera-Rodriguez, Julian Fierrez | 2026-03-10 | 💻 cs

Why Learn What Physics Already Knows? Realizing Agile mmWave-based Human Pose Estimation via Physics-Guided Preprocessing

This paper proposes a physics-guided preprocessing framework for millimeter-wave human pose estimation that explicitly models signal correlations and kinematics to achieve real-time, lightweight performance with significantly fewer parameters than existing data-driven baselines while maintaining competitive accuracy.

Shuntian Zheng, Jiaqi Li, Minzhe Ni, Xiaoman Lu, Yu Guan | 2026-03-10 | 💻 cs

SiMO: Single-Modality-Operable Multimodal Collaborative Perception

SiMO introduces a novel collaborative perception framework that utilizes Length-Adaptive Multi-Modal Fusion (LAMMA) and a "Pretrain-Align-Fuse-RD" training strategy to overcome sensor failures and semantic mismatches, ensuring robust performance across all individual modalities while maintaining effective multimodal integration.

Jiageng Wen, Shengjie Zhao, Bing Li, Jiafeng Huang, Kenan Ye, Hao Deng | 2026-03-10 | 💻 cs

Topologically Stable Hough Transform

This paper proposes a topologically stable variant of the Hough transform for line detection in point clouds, which replaces the traditional discretized voting scheme with a continuous score function and utilizes persistent homology to identify candidate lines via an efficient algorithm.

Stefan Huber, Kristóf Huszár, Michael Kerber, Martin Uray | 2026-03-10 | 💻 cs

Coupling Europe's Capacity Markets

This paper proposes a novel flow-based conceptual design for coupling European capacity markets that, by harnessing cross-border capacity while respecting network constraints, reduces system costs and improves investment efficiency compared to isolated national mechanisms.

Kamal Adekola, Laurens de Vries, Kenneth Bruninx | 2026-03-10 | 💻 cs

DynamicVGGT: Learning Dynamic Point Maps for 4D Scene Reconstruction in Autonomous Driving

This paper introduces DynamicVGGT, a unified feed-forward framework that extends static 3D perception to dynamic 4D scene reconstruction for autonomous driving by jointly predicting current and future point maps, utilizing a Motion-aware Temporal Attention module for temporal coherence, and employing a Dynamic 3D Gaussian Splatting Head to explicitly model point motion and refine geometry.

Zhuolin He, Jing Li, Guanghao Li, Xiaolei Chen, Jiacheng Tang, Siyang Zhang, Zhounan Jin, Feipeng Cai, Bin Li, Jian Pu, Jia Cai, Xiangyang Xue | 2026-03-10 | 💻 cs

WaDi: Weight Direction-aware Distillation for One-step Image Synthesis

The paper proposes WaDi, a one-step image synthesis framework built on the insight that changes in weight direction matter more than changes in weight norm during distillation. It introduces the parameter-efficient LoRaD adapter to achieve state-of-the-art performance with only 10% of the trainable parameters.

Lei Wang, Yang Cheng, Senmao Li, Ge Wu, Yaxing Wang, Jian Yang | 2026-03-10 | 💻 cs

Seed2Scale: A Self-Evolving Data Engine for Embodied AI via Small to Large Model Synergy and Multimodal Evaluation

Seed2Scale is a self-evolving data engine that overcomes data bottlenecks in embodied AI by synergizing a lightweight "SuperTiny" model for robust data collection with a large Vision-Language Model for autonomous quality verification, enabling a target model to achieve a 131.2% performance improvement starting from just four seed demonstrations.

Cong Tai, Zhaoyu Zheng, Haixu Long, Hansheng Wu, Zhengbin Long, Haodong Xiang, Rong Shi, Zhuo Cui, Shizhuang Zhang, Gang Qiu, He Wang, Ruifeng Li, Biao Liu, Zhenzhe Sun, Tao Shen | 2026-03-10 | 💻 cs

FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use

The paper introduces FinToolBench, the first real-world, runnable benchmark that evaluates LLM agents on 760 executable financial tools using a novel framework assessing timeliness, intent, and regulatory compliance, alongside a proposed finance-aware baseline named FATR to advance trustworthy agentic AI in finance.

Jiaxuan Lu, Kong Wang, Yemin Wang, Qingmei Tang, Hongwei Zeng, Xiang Chen, Jiahao Pi, Shujian Deng, Lingzhi Chen, Yi Fu, Kehua Yang, Xiao Sun | 2026-03-10 | 💻 cs

Event-based Motion & Appearance Fusion for 6D Object Pose Tracking

This paper proposes a learning-free method for 6D object pose tracking that fuses event-based optical flow for high-speed pose propagation with a template-based correction strategy, demonstrating superior performance over state-of-the-art algorithms in highly dynamic scenarios where traditional RGB-D cameras struggle with motion blur and low frame rates.

Zhichao Li, Chiara Bartolozzi, Lorenzo Natale, Arren Glover | 2026-03-10 | 💻 cs

SAIL: Test-Time Scaling for In-Context Imitation Learning with VLM

SAIL is a test-time scaling framework that enhances one-shot robot imitation learning by reframing trajectory generation as an iterative refinement process guided by Monte Carlo Tree Search, an automated retrieval archive, and a vision-language model-based scoring mechanism, thereby significantly improving success rates across diverse manipulation tasks.

Makoto Sato, Yusuke Iwasawa, Yujin Tang, So Kuroki | 2026-03-10 | 💻 cs

Prototype-Guided Concept Erasure in Diffusion Models

This paper introduces a prototype-guided approach that leverages the intrinsic embedding geometry of diffusion models to identify and cluster concept representations, enabling the reliable erasure of broad, multi-faceted concepts while preserving overall image quality.

Yuze Cai, Jiahao Lu, Hongxiang Shi, Yichao Zhou, Hong Lu | 2026-03-10 | 💻 cs

Less is More: Robust Zero-Communication 3D Pursuit-Evasion via Representational Parsimony

This paper demonstrates that explicitly reducing observation dimensionality and implementing locality-aware credit assignment in a communication-free multi-agent system enhances robustness and performance in asymmetric 3D pursuit-evasion tasks within cluttered environments.

Jialin Ying, Zhihao Li, Zicheng Dong, Guohua Wu, Yihuan Liao | 2026-03-10 | 💻 cs

OSCAR: Occupancy-based Shape Completion via Acoustic Neural Implicit Representations

The paper proposes OSCAR, a label-free method that utilizes coupled latent spaces and neural implicit representations to accurately reconstruct complete 3D vertebral anatomy from partial ultrasound images by implicitly modeling acoustic shadowing and signal transmission, achieving an 80% improvement in HD95 score over state-of-the-art techniques.

Magdalena Wysocki, Kadir Burak Buldu, Miruna-Alexandra Gafencu, Mohammad Farid Azampour, Nassir Navab | 2026-03-10 | 💻 cs

A Blockchain-based Traceability System for AI-Driven Engine Blade Inspection

This paper presents BladeChain, a blockchain-based system that integrates multi-stakeholder endorsement, automated scheduling, and AI model provenance to provide immutable, auditable traceability for aircraft engine blade inspections across the entire component life cycle.

Mahmoud Hafez, Eman Ouda, Mohammed A. Mohammed Eltoum, Khaled Salah, Yusra Abdulrahman | 2026-03-10 | 💻 cs

Novel Semantic Prompting for Zero-Shot Action Recognition

The paper introduces SP-CLIP, a lightweight zero-shot action recognition framework that significantly improves performance on fine-grained and compositional actions by augmenting frozen vision-language models with structured, multi-level semantic prompts without requiring any additional parameter training or visual encoder modifications.

Salman Iqbal, Waheed Rehman | 2026-03-10 | 💻 cs

Deconstructing Multimodal Mathematical Reasoning: Towards a Unified Perception-Alignment-Reasoning Paradigm

This paper systematically reviews recent advancements in Multimodal Mathematical Reasoning by proposing a unified Perception-Alignment-Reasoning paradigm, categorizing existing approaches around four fundamental questions regarding information extraction, representation, reasoning, and evaluation, while outlining future research challenges.

Tianyu Yang, Sihong Wu, Yilun Zhao, Zhenwen Liang, Lisen Dai, Chen Zhao, Minhao Cheng, Arman Cohan, Xiangliang Zhang | 2026-03-10 | 💻 cs

Silicone Ethernet (SEth): a Nervous System for Robotic Touch

This paper introduces Silicone Ethernet (SEth), a wireless, battery-free system that embeds sensing, communication, and power transfer capabilities directly into a conductive silicone substrate to overcome the cabling limitations of fine-grained robotic touch sensing.

Mengyao Liu, Dag Malstaf, Jonathan Oostvogels, Sam Michiels, Alexander Badri-Spröwitz, Danny Hughes | 2026-03-10 | 💻 cs

SoK: Harmonizing Attack Graphs and Intrusion Detection Systems

This paper presents the first systematic analysis of Attack Graph and Intrusion Detection System integration, proposing a novel unifying lifecycle framework that establishes a continuous feedback loop to enhance threat detection and incident response.

Andrea Agiollo, Enkeleda Bardhi, Alessandro Palma, Riccardo Lazzeretti, Silvia Bonomi, Fernando Kuipers | 2026-03-10 | 💻 cs
