cs papers | Gist.Science

VisualAD: Language-Free Zero-Shot Anomaly Detection via Vision Transformer

VisualAD is a language-free, zero-shot anomaly detection framework that leverages a frozen Vision Transformer backbone with learnable normality and abnormality tokens, along with spatial-aware cross-attention and self-alignment modules, to achieve state-of-the-art performance across industrial and medical domains without relying on text encoders or cross-modal alignment.

Yanning Hou, Peiyuan Li, Zirui Liu, Yitong Wang, Yanran Ruan, Jianfeng Qiu, Ke Xu | 2026-03-10 | cs

The Sense of Misinformation Can Harm Local Community: A Case Study of Community Conflict

This paper introduces the concept of "sense of misinformation"—the mistaken perception that truthful information is false—and demonstrates through a casino proposal case study how this phenomenon, driven by governance miscoordination and communication breakdowns, erodes community trust and democracy, while proposing design strategies to mitigate its harmful effects.

Jiyoon Kim, Jie Cai, Srishti Gupta, John M. Carroll | 2026-03-10 | cs

From Daily Song to Daily Self: Supporting Reflective Songwriting of Deaf and Hard-of-Hearing Individuals through Generative Music AI

This paper presents SoulNote, a generative AI system designed to support Deaf and Hard-of-Hearing individuals in engaging in iterative, multi-session songwriting as a reflective journaling practice that fosters emotional growth through self-insight, emotion regulation, and improved self-care attitudes.

Youjin Choi, Jinyoung Yoo, Jaeyoung Moon, Yoonjae Kim, Eun Young Lee, Jennifer G. Kim, Jin-Hyuk Hong | 2026-03-10 | cs

WeldAR: Augmenting Live Hands-On Training with In-Situ Guidance for Novice Learners

The paper presents WeldAR, an Augmented Reality system integrated into welding equipment that provides real-time in-situ guidance to novices, demonstrating through a user study that it significantly improves welding performance and the transfer of embodied skills compared to traditional video instruction.

Chuhan (Franklin) Xu (Carnegie Mellon University), Lia Sparingga Purnamasari (Carnegie Mellon University), Zhenfang Chen (Carnegie Mellon University), Daragh Byrne (Carnegie Mellon University), Dina El-Zanfaly (Carnegie Mellon University) | 2026-03-10 | Author reviewed | cs

SGG-R$^{\rm 3}$: From Next-Token Prediction to End-to-End Unbiased Scene Graph Generation

The paper introduces SGG-R$^{\rm 3}$, a structured reasoning framework that combines chain-of-thought-guided supervised fine-tuning with relation augmentation and a novel dual-granularity reward scheme in reinforcement learning to achieve end-to-end unbiased Scene Graph Generation with improved recall and reduced bias on long-tailed distributions.

Jiaye Feng, Qixiang Yin, Yuankun Liu, Tong Mo, Weiping Li | 2026-03-10 | cs

GOMA: Geometrically Optimal Mapping via Analytical Modeling for Spatial Accelerators

This paper presents GOMA, a geometrically abstracted, globally optimal framework that uses analytical modeling to efficiently solve the combinatorial GEMM mapping problem for spatial accelerators, achieving significant improvements in energy-delay product and search speed over state-of-the-art methods.

Wulve Yang, Hailong Zou, Rui Zhou, Jionghao Zhang, Qiang Li, Gang Li, Yi Zhan, Shushan Qiao | 2026-03-10 | cs

Designing a Generative AI-Assisted Music Psychotherapy Tool for Deaf and Hard-of-Hearing Individuals

This paper presents a co-designed generative AI tool that enables Deaf and Hard-of-Hearing individuals to engage in music psychotherapy through visual and conversational songwriting, demonstrating that collaborative human-AI interaction can effectively facilitate emotional release and self-understanding for this underserved population.

Youjin Choi, Jaeyoung Moon, Jinyoung Yoo, Jennifer G. Kim, Jin-Hyuk Hong | 2026-03-10 | cs

Model-Free DRL Control for Power Inverters: From Policy Learning to Real-Time Implementation via Knowledge Distillation

This paper proposes a model-free Deep Reinforcement Learning control framework for power inverters that utilizes an error energy-guided hybrid reward and adaptive importance weighting to distill a heavy policy into a lightweight neural network, achieving microsecond-level inference, superior transient response, and robust performance on a hardware experimental platform.

Yang Yang, Chenggang Cui, Xitong Niu, Jiaming Liu, Chuanlin Zhang | 2026-03-10 | cs

Listening with the Eyes: Benchmarking Egocentric Co-Speech Grounding across Space and Time

This paper introduces EcoG-Bench, a rigorous bilingual benchmark for egocentric co-speech grounding that reveals a significant performance gap between humans and state-of-the-art MLLMs, highlighting how multimodal interface limitations rather than reasoning deficits hinder the alignment of speech with pointing gestures in situated collaboration.

Weijie Zhou, Xuantang Xiong, Zhenlin Hu, Xiaomeng Zhu, Chaoyang Zhao, Honghui Dong, Zhengyou Zhang, Ming Tang, Jinqiao Wang | 2026-03-10 | cs

Advancing Automated Algorithm Design via Evolutionary Stagewise Design with LLMs

This paper introduces EvoStage, a novel evolutionary paradigm that leverages large language models with a stagewise, multi-agent approach and real-time feedback to overcome the limitations of black-box modeling, successfully generating algorithm designs that outperform both human experts and existing methods in complex industrial tasks like chip placement and black-box optimization.

Chen Lu, Ke Xue, Chengrui Gao, Yunqi Shi, Siyuan Xu, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou | 2026-03-10 | cs

Adaptive Collaboration with Humans: Metacognitive Policy Optimization for Multi-Agent LLMs with Continual Learning

This paper introduces HILA, a Human-In-the-Loop Multi-Agent Collaboration framework that employs Dual-Loop Policy Optimization to train agents with metacognitive policies for dynamically deferring to human experts and continuously improving their reasoning capabilities, thereby overcoming the static knowledge limitations of purely autonomous systems.

Wei Yang, Defu Cao, Jiacheng Pang, Muyan Weng, Yan Liu | 2026-03-10 | cs

VORL-EXPLORE: A Hybrid Learning Planning Approach to Multi-Robot Exploration in Dynamic Environments

VORL-EXPLORE is a hybrid learning and planning framework for multi-robot exploration in dynamic environments that couples task allocation with motion execution via a shared navigability fidelity signal, enabling adaptive arbitration between global and reactive policies to prevent bottlenecks and ensure robust, collision-free coverage.

Ning Liu, Sen Shen, Zheng Li, Sheng Liu, Dongkun Han, Shangke Lyu, Thomas Braunl | 2026-03-10 | cs

ZK-ACE: Identity-Centric Zero-Knowledge Authorization for Post-Quantum Blockchain Systems

ZK-ACE is a post-quantum authorization framework that replaces kilobyte-scale signature artifacts with compact, identity-bound zero-knowledge proofs, achieving an order-of-magnitude reduction in on-chain data while providing formal security guarantees against replay and substitution attacks.

Jian Sheng Wang | 2026-03-10 | cs

OSExpert: Computer-Use Agents Learning Professional Skills via Exploration

The paper introduces OSExpert, a computer-use agent that leverages a GUI-based depth-first search exploration algorithm to discover action primitives and self-construct a skill curriculum, significantly improving performance and efficiency on complex tasks and approaching human expert levels.

Jiateng Liu, Zhenhailong Wang, Rushi Wang, Bingxuan Li, Jeonghwan Kim, Aditi Tiwari, Pengfei Yu, Denghui Zhang, Heng Ji | 2026-03-10 | cs

Extend Your Horizon: A Device-Agnostic Surgical Tool Tracking Framework with Multi-View Optimization for Augmented Reality

This paper presents a device-agnostic surgical tool tracking framework that fuses multiple sensing modalities within a dynamic scene graph to overcome line-of-sight occlusions and enhance the robustness of augmented reality visualization in operating rooms.

Jiaming Zhang, Mingxu Liu, Hongchao Shu, Ruixing Liang, Yihao Liu, Ojas Taskar, Amir Kheradmand, Mehran Armand, Alejandro Martin-Gomez | 2026-03-10 | cs

ACE-GF-based Attestation Relay for PQC - Lightweight Mempool Propagation Without On-Path Proofs

The paper introduces AR-ACE, a lightweight mempool propagation protocol for post-quantum blockchains that eliminates on-path validity proofs by having relay nodes forward only objects with compact attestations, thereby achieving an order-of-magnitude bandwidth reduction while shifting proof verification entirely to the builder.

Jian Sheng Wang | 2026-03-10 | cs

Energy-Efficient Online Scheduling for Wireless Powered Mobile Edge Computing Networks

This paper proposes an energy-efficient online scheduling framework for Wireless Powered Mobile Edge Computing networks that utilizes Lyapunov optimization and a relax-then-adjust approach to solve the joint wireless power transfer and computation offloading problem, achieving a fundamental trade-off between latency and energy consumption while ensuring theoretical performance guarantees.

Xingqiu He, Chaoqun You, Yuzhi Yang, Zihan Chen, Yuhang Shen, Tony Q. S. Quek, Yue Gao | 2026-03-10 | cs

On the Feasibility and Opportunity of Autoregressive 3D Object Detection

The paper introduces AutoReg3D, an autoregressive 3D object detector that reformulates LiDAR-based detection as a sequence generation task using a near-to-far ordering to eliminate reliance on hand-crafted components like anchors and NMS, thereby achieving competitive performance while enabling the integration of advanced language model techniques such as reinforcement learning.

Zanming Huang, Jinsu Yoo, Sooyoung Jeon, Zhenzhen Liu, Mark Campbell, Kilian Q Weinberger, Bharath Hariharan, Wei-Lun Chao, Katie Z Luo | 2026-03-10 | cs

PreHO: Predictive Handover for LEO Satellite Networks

This paper proposes PreHO, a predictive handover mechanism for Low-Earth Orbit Satellite Networks that leverages the stable and predictable channel states of fast-moving satellites to proactively plan optimal handover strategies, thereby significantly reducing signaling overhead and latency compared to traditional reactive schemes.

Xingqiu He, Zijie Ying, Chaoqun You, Yue Gao | 2026-03-10 | cs

TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size

TeamHOI is a decentralized framework that leverages a Transformer-based policy and a masked Adversarial Motion Prior strategy to enable a single unified policy to control scalable, physically realistic cooperative human-object interactions among any number of humanoid agents.

Stefan Lionar, Gim Hee Lee | 2026-03-10 | cs
