Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models

This paper introduces Think-as-You-See (TaYS), a unified framework that enables concurrent, streaming Chain-of-Thought reasoning for Large Vision-Language Models by decoupling visual encoding from textual reasoning, thereby outperforming traditional batch and interleaved approaches in both accuracy and latency for real-time video understanding.

Jialiang Zhang, Junlong Tong, Junyan Lin, Hao Wu, Yirong Sun, Yunpu Ma, Xiaoyu Shen | 2026-03-09 | cs

Efficient Query Rewrite Rule Discovery via Standardized Enumeration and Learning-to-Rank (extend)

This paper presents SLER, a scalable system that combines standardized template enumeration with a learning-to-rank model to overcome the exponential search space and redundancy challenges of existing methods, successfully discovering over one million high-quality query rewrite rules for complex query plans.

Yuan Zhang, Yuxing Chen, Yuekun Yu, Jinbin Huang, Rui Mao, Anqun Pan, Lixiong Zheng, Jianbin Qin | 2026-03-09 | cs

Publication and Maintenance of Relational Data in Enterprise Knowledge Graphs (Revised Version)

This paper proposes a formal framework, architecture, and algorithms for constructing and incrementally maintaining materialized RDB2RDF views to enable efficient, semantically integrated access to legacy relational data within Enterprise Knowledge Graphs.

Vânia Maria Ponte Vidal (Departamento de Computação, UFC, Fortaleza, Brazil), Valéria Magalhães Pequeno (TechLab, Departamento de Ciências e Tecnologias, UAL, Lisboa, Portugal), Marco Antonio Casanova (Instituto Tecgraf, PUC-Rio, Rio de Janeiro, Brazil), Narciso Arruda (Departamento de Computação, UFC, Fortaleza, Brazil), Carlos Brito (Departamento de Computação, UFC, Fortaleza, Brazil) | 2026-03-09 | cs

Biometric-enabled Personalized Augmentative and Alternative Communications

This study proposes a roadmap for integrating biometric technologies into personalized Augmentative and Alternative Communication (AAC) systems and introduces concepts such as the AAC biometric register. Its case studies show that current AI accuracy in gesture and sign language recognition remains insufficient for practical applications, and it offers recommendations to bridge this gap.

S. Yanushkevich, E. Berepiki, P. Ciunkiewicz, V. Shmerko, G. Wolbring, R. Guest | 2026-03-09 | cs

ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation

ProFocus is a training-free framework that enhances Vision-and-Language Navigation by unifying proactive perception, which generates targeted visual queries to fill information gaps, and focused reasoning, which utilizes Branch-Diverse Monte Carlo Tree Search to prioritize high-value historical contexts, thereby achieving state-of-the-art zero-shot performance on R2R and REVERIE benchmarks.

Wei Xue, Mingcheng Li, Xuecheng Wu, Jingqun Tang, Dingkang Yang, Lihua Zhang | 2026-03-09 | cs

Privacy-Preserving Collaborative Medical Image Segmentation Using Latent Transform Networks

This paper introduces PPCMI-SF, a privacy-preserving collaborative framework that utilizes client-specific latent transforms and server-side mapping to achieve high-accuracy, real-time medical image segmentation across heterogeneous institutions while effectively resisting inversion and membership inference attacks without sharing raw data.

Saheed Ademola Bello, Muhammad Shahid Jabbar, Muhammad Sohail Ibrahim, Shujaat Khan | 2026-03-09 | cs

Digital-Twin Losses for Lane-Compliant Trajectory Prediction at Urban Intersections

This paper presents a digital-twin-driven V2X trajectory prediction framework for urban intersections that adds a novel twin loss function to the standard MSE loss to enforce traffic rules, collision avoidance, and motion diversity, thereby significantly reducing safety violations while maintaining high prediction accuracy and real-time performance.

Kuo-Yi Chao, Erik Leo Haß, Melina Gegg, Jiajie Zhang, Ralph Raßhofer, Alois Christian Knoll | 2026-03-09 | cs