cs papers | Gist.Science

ROSER: Few-Shot Robotic Sequence Retrieval for Scalable Robot Learning

The paper introduces ROSER, a lightweight few-shot retrieval framework that extracts reusable, task-centric segments from unlabeled robotic logs using only 3-5 reference examples. It addresses data scarcity by making large-scale continuous interaction datasets usable at scale and with high accuracy, without task-specific training.

Zillur Rahman, Eddison Pham, Alejandro Daniel Noel, Cristian Meo | 2026-03-09 | cs

FastLightGen: Fast and Light Video Generation with Fewer Steps and Parameters

FastLightGen is a novel algorithm that simultaneously compresses model parameters and reduces inference steps through an optimized teacher-student distillation framework, achieving state-of-the-art efficiency and visual quality in video generation with significantly fewer resources.

Shitong Shao, Yufei Gu, Zeke Xie | 2026-03-09 | cs

VSearcher: Long-Horizon Multimodal Search Agent via Reinforcement Learning

This paper introduces VSearcher, a reinforcement learning-based multimodal search agent that transforms static models into capable long-horizon web browsers through an iterative data synthesis pipeline and an SFT-then-RL training strategy, achieving superior performance on the proposed MM-SearchExam benchmark.

Ruiyang Zhang, Qianguo Sun, Chao Song, Yiyan Qi, Zhedong Zheng | 2026-03-09 | cs

Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models

This paper introduces Think-as-You-See (TaYS), a unified framework that enables concurrent, streaming Chain-of-Thought reasoning for Large Vision-Language Models by decoupling visual encoding from textual reasoning, thereby outperforming traditional batch and interleaved approaches in both accuracy and latency for real-time video understanding.

Jialiang Zhang, Junlong Tong, Junyan Lin, Hao Wu, Yirong Sun, Yunpu Ma, Xiaoyu Shen | 2026-03-09 | cs

AI Researchers' Views on Automating AI R&D and Intelligence Explosions

A 2025 survey of 25 leading AI researchers reveals a consensus that automating AI research poses a severe and urgent risk due to the potential for recursive self-improvement, while highlighting significant disagreements on timelines, the likelihood of explosive growth, and the most effective governance strategies.

Severin Field, Raymond Douglas, David Krueger | 2026-03-09 | cs

Scrambler: Mixed Boolean Arithmetic Obfuscation Tool Using E-graph and Equality Expansion

The paper introduces Scrambler, an e-graph-based tool that utilizes Equality Expansion to efficiently generate complex and diverse Mixed Boolean Arithmetic obfuscation expressions with guaranteed equivalence, demonstrating superior expressiveness and complexity compared to existing solutions.

Seoksu Lee, Sangjun An, Eun-Sun Cho | 2026-03-09 | cs

Efficient Query Rewrite Rule Discovery via Standardized Enumeration and Learning-to-Rank (extend)

This paper presents SLER, a scalable system that combines standardized template enumeration with a learning-to-rank model to overcome the exponential search space and redundancy challenges of existing methods, successfully discovering over one million high-quality query rewrite rules for complex query plans.

Yuan Zhang, Yuxing Chen, Yuekun Yu, Jinbin Huang, Rui Mao, Anqun Pan, Lixiong Zheng, Jianbin Qin | 2026-03-09 | cs

Publication and Maintenance of Relational Data in Enterprise Knowledge Graphs (Revised Version)

This paper proposes a formal framework, architecture, and algorithms for constructing and incrementally maintaining materialized RDB2RDF views to enable efficient, semantically integrated access to legacy relational data within Enterprise Knowledge Graphs.

Vânia Maria Ponte Vidal (Departamento de Computação, UFC, Fortaleza, Brazil), Valéria Magalhães Pequeno (TechLab, Departamento de Ciências e Tecnologias, UAL, Lisboa, Portugal), Marco Antonio Casanova (Instituto Tecgraf, Puc-Rio, Rio de Janeiro, Brazil), Narciso Arruda (Departamento de Computação, UFC, Fortaleza, Brazil), Carlos Brito (Departamento de Computação, UFC, Fortaleza, Brazil) | 2026-03-09 | cs

XR and Hybrid Data Visualization Spaces for Enhanced Data Analytics

This paper advocates for the use of Extended Reality (XR) to create hybrid visualization spaces that seamlessly integrate 2D and 3D representations, offering a solution to the challenges of high-dimensional data and AI interpretability through three supporting case studies.

Santiago Lombeyda, S. G. Djorgovski, Ciro Donalek | 2026-03-09 | cs

Biometric-enabled Personalized Augmentative and Alternative Communications

This study proposes a roadmap for integrating biometric technologies into personalized Augmentative and Alternative Communication (AAC) systems, introducing concepts such as the AAC biometric register. Its case studies show that current AI accuracy in gesture and sign language recognition remains insufficient for practical applications, and it offers recommendations to bridge this gap.

S. Yanushkevich, E. Berepiki, P. Ciunkiewicz, V. Shmerko, G. Wolbring, R. Guest | 2026-03-09 | cs

The People's Gaze: Co-Designing and Refining Gaze Gestures with General Users and Gaze Interaction Experts

This paper presents a two-phase methodology that combines co-design workshops with non-expert users and expert refinement to develop a grounded, intuitive set of 32 gaze gestures and design principles for hands-free interaction on gaze-enabled devices.

Yaxiong Lei, Xinya Gong, Shijing He, Yafei Wang, Mohamed Khamis, Juan Ye | 2026-03-09 | cs

Enhancing Tool Calling in LLMs with the International Tool Calling Dataset

This paper introduces International Tool Calling (ITC), a large-scale, multilingual benchmark comprising over 3,500 real APIs and 17,000 tasks across 40 countries, designed to address the limitations of existing datasets by improving LLM robustness, cross-lingual generalization, and performance in realistic global tool-calling scenarios.

Zuoyu Zhang, Yancheng Zhu | 2026-03-09 | cs

Human-Centered Ambient and Wearable Sensing for Automated Monitoring in Dementia Care: A Scoping Review

This scoping review maps the landscape of wearable and ambient sensing technologies for dementia care from 2015 to 2025, proposing five key human-centered implementation principles to guide the development of ethical, scalable, and autonomy-enhancing monitoring systems.

Mason Kadem, Sarah Masri, Anthea Innes, Rong Zheng | 2026-03-09 | cs

CoEditor++: Instruction-based Visual Editing via Cognitive Reasoning

CoEditor++ is a training-free, cognitively structured framework that leverages a two-stage "what-to-edit" and "how-to-edit" reasoning process with self-reflection to achieve state-of-the-art, visually consistent, and interpretable instruction-based image editing using only open-source components.

Minheng Ni, Yutao Fan, Zhengyuan Yang, Yeli Shen, Yuxiang Wei, Yaowen Zhang, Lijuan Wang, Lei Zhang, Wangmeng Zuo | 2026-03-09 | cs

Ecosystem Trust Profiles

This paper introduces "ecosystem trust profiles" as a method for digital ecosystems to autonomously define and advertise trusted credentials, demonstrating how this framework enables cross-ecosystem interoperability while preserving sovereignty, though it reveals that such trust remains fragile without additional external governance mechanisms.

Christoph F. Strnadl | 2026-03-09 | cs

ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation

ProFocus is a training-free framework that enhances Vision-and-Language Navigation by unifying two components: proactive perception, which generates targeted visual queries to fill information gaps, and focused reasoning, which uses Branch-Diverse Monte Carlo Tree Search to prioritize high-value historical contexts. It achieves state-of-the-art zero-shot performance on the R2R and REVERIE benchmarks.

Wei Xue, Mingcheng Li, Xuecheng Wu, Jingqun Tang, Dingkang Yang, Lihua Zhang | 2026-03-09 | cs

Edges Are All You Need: Robust Gait Recognition via Label-Free Structure

This paper introduces SKETCHGAIT, a robust gait recognition framework that leverages a novel label-free "SKETCH" modality to extract dense structural cues from RGB images, demonstrating that combining this edge-based representation with traditional parsing methods significantly outperforms existing silhouette- and parsing-based approaches.

Chao Zhang, Zhuang Zheng, Ruixin Li, Zhanyong Mei | 2026-03-09 | cs

Privacy-Preserving Collaborative Medical Image Segmentation Using Latent Transform Networks

This paper introduces PPCMI-SF, a privacy-preserving collaborative framework that utilizes client-specific latent transforms and server-side mapping to achieve high-accuracy, real-time medical image segmentation across heterogeneous institutions while effectively resisting inversion and membership inference attacks without sharing raw data.

Saheed Ademola Bello, Muhammad Shahid Jabbar, Muhammad Sohail Ibrahim, Shujaat Khan | 2026-03-09 | cs

Digital-Twin Losses for Lane-Compliant Trajectory Prediction at Urban Intersections

This paper presents a digital twin-driven V2X trajectory prediction framework for urban intersections that employs a novel twin loss function alongside standard MSE to enforce traffic rules, collision avoidance, and motion diversity, thereby significantly reducing safety violations while maintaining high prediction accuracy and real-time performance.

Kuo-Yi Chao, Erik Leo Haß, Melina Gegg, Jiajie Zhang, Ralph Raßhofer, Alois Christian Knoll | 2026-03-09 | cs

AutothinkRAG: Complexity-Aware Control of Retrieval-Augmented Reasoning for Image-Text Interaction

AutoThinkRAG is a complexity-aware framework for image-text interaction that improves document question answering by routing queries based on difficulty and decoupling visual interpretation from logical reasoning to achieve state-of-the-art performance with reduced inference costs.

Jiashu Yang, Chi Zhang, Abudukelimu Wuerkaixi, Xuxin Cheng, Cao Liu, Ke Zeng, Xu Jia, Xunliang Cai | 2026-03-09 | cs
