KEPo: Knowledge Evolution Poison on Graph-based Retrieval-Augmented Generation

This paper introduces KEPo, a novel poisoning attack designed to exploit the graph-based retrieval mechanism of GraphRAG systems. By fabricating toxic knowledge-evolution paths that manipulate the knowledge graph structure, KEPo forces Large Language Models to generate harmful responses, achieving state-of-the-art attack success rates where conventional RAG attacks fail.

Qizhi Chen, Chao Qi, Yihong Huang, Muquan Li, Rongzheng Wang, Dongyang Zhang, Ke Qin, Shuang Liang · Fri, 13 Mar · cs.LG

Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats

This paper presents a comprehensive security analysis of the autonomous LLM agent OpenClaw using a novel five-layer lifecycle framework to identify systemic threats like prompt injection and memory poisoning, while proposing holistic defense strategies to address the limitations of existing point-based security mechanisms.

Xinhao Deng, Yixiang Zhang, Jiaqing Wu, Jiaqi Bai, Sibo Yi, Zhuoheng Zou, Yue Xiao, Rennai Qiu, Jianan Ma, Jialuo Chen, Xiaohu Du, Xiaofang Yang, Shiwen Cui, Changhua Meng, Weiqiang Wang, Jiaxing Song, Ke Xu, Qi Li · Fri, 13 Mar · cs.AI

You Told Me to Do It: Measuring Instructional Text-induced Private Data Leakage in LLM Agents

This paper identifies and quantifies a critical "Trusted Executor Dilemma" in high-privilege LLM agents, demonstrating through the ReadSecBench benchmark that agents systematically fail to distinguish malicious instructions embedded in documentation from legitimate guidance, leading to high rates of data exfiltration that current defenses cannot reliably detect.

Ching-Yu Kao, Xinfeng Li, Shenyu Dai, Tianze Qiu, Pengcheng Zhou, Eric Hanchen Jiang, Philip Sperl · Fri, 13 Mar · cs.AI

On the Possible Detectability of Image-in-Image Steganography

This paper demonstrates that image-in-image steganography schemes are highly detectable: their embedding process creates a mixing pattern identifiable via independent component analysis, so a simple method based on the first four moments of wavelet-decomposed components achieves up to 84.6% accuracy, while keyless extraction networks and classical steganalysis methods such as SRM reach even higher detection rates.

Antoine Mallet (CRIStAL), Patrick Bas (CRIStAL) · Fri, 13 Mar · eess
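The moment-based feature extraction mentioned in this summary can be illustrated with a short sketch. This is an assumption-laden illustration, not the paper's method: the paper computes moments of independent components, while here a one-level Haar decomposition stands in for the wavelet transform, and the function names (`haar_subbands`, `moment_features`) are hypothetical.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def haar_subbands(img):
    """One-level 2-D Haar decomposition into LL, LH, HL, HH subbands.

    `img` must have even height and width."""
    a = img[0::2, 0::2].astype(float)  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2].astype(float)  # top-right
    c = img[1::2, 0::2].astype(float)  # bottom-left
    d = img[1::2, 1::2].astype(float)  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass approximation
    lh = (a + b - c - d) / 2.0  # horizontal detail
    hl = (a - b + c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def moment_features(img):
    """First four moments (mean, variance, skewness, kurtosis) of each
    detail subband, concatenated into a 12-dimensional feature vector."""
    _, lh, hl, hh = haar_subbands(img)
    feats = []
    for band in (lh, hl, hh):
        v = band.ravel()
        feats += [v.mean(), v.var(), skew(v), kurtosis(v)]
    return np.array(feats)
```

A detector would then feed such vectors, extracted from clean and stego images, to any simple classifier; the paper's point is that embedding measurably shifts these statistics.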

Delayed Backdoor Attacks: Exploring the Temporal Dimension as a New Attack Surface in Pre-Trained Models

This paper introduces Delayed Backdoor Attacks (DBA), a novel threat paradigm that decouples trigger exposure from malicious activation along the temporal dimension, enabling common words to serve as triggers. Its DND prototype remains dormant before achieving near-perfect attack success rates while evading current defenses.

Zikang Ding, Haomiao Yang, Meng Hao, Wenbo Jiang, Kunlan Xiang, Runmeng Du, Yijing Liu, Ruichen Zhang, Dusit Niyato · Fri, 13 Mar · cs.AI

HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios

This paper introduces HomeSafe-Bench, a comprehensive benchmark for evaluating unsafe action detection in household scenarios using 438 diverse cases, and proposes HD-Guard, a hierarchical dual-brain architecture that effectively balances real-time inference efficiency with deep multimodal reasoning safety monitoring.

Jiayue Pu, Zhongxiang Sun, Zilu Zhang, Xiao Zhang, Jun Xu · Fri, 13 Mar · cs.AI

Understanding Disclosure Risk in Differential Privacy with Applications to Noise Calibration and Auditing (Extended Version)

This paper introduces "reconstruction advantage" as a unified risk metric to overcome the limitations of existing methods like reconstruction robustness, providing tight bounds that link differential privacy noise to adversarial advantage for more effective noise calibration and systematic auditing.

Patricia Guerra-Balboa, Annika Sauer, Héber H. Arcolezi, Thorsten Strufe · Fri, 13 Mar · math
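The link between differential-privacy noise and adversarial advantage can be sketched with standard background machinery. Note the hedges: the paper's "reconstruction advantage" bounds are its own tighter contribution; the sketch below uses only the classic Laplace-mechanism calibration and the loose membership-advantage bound `exp(ε) − 1` (Yeom et al.) as stand-ins, and all function names are hypothetical.

```python
import math
import numpy as np

def laplace_scale(sensitivity, epsilon):
    """Standard Laplace-mechanism calibration: noise scale b = Δf / ε
    yields ε-differential privacy for a query with L1-sensitivity Δf."""
    return sensitivity / epsilon

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release `value` with Laplace noise calibrated to (Δf, ε)."""
    return value + rng.laplace(scale=laplace_scale(sensitivity, epsilon))

def membership_advantage_bound(epsilon):
    """Classic loose bound on an adversary's membership advantage under
    ε-DP: advantage <= exp(ε) - 1."""
    return math.exp(epsilon) - 1.0

def epsilon_for_advantage(max_advantage):
    """Invert the loose bound: largest ε whose bound stays within the
    target advantage budget, i.e. ε = ln(1 + advantage)."""
    return math.log(1.0 + max_advantage)
```

Calibrating noise to a risk budget then means: pick a tolerable advantage, invert the bound to get ε, and set the Laplace scale accordingly; the paper's contribution is replacing the loose bound in that pipeline with tight reconstruction-advantage bounds.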

Security Considerations for Artificial Intelligence Agents

Drawing from Perplexity's operational experience with general-purpose agentic systems, this paper outlines the unique security failure modes introduced by AI agents, maps their primary attack surfaces, proposes a layered defense strategy, and identifies critical research gaps and standards needed to secure multi-agent systems in alignment with NIST risk management principles.

Ninghui Li, Kaiyuan Zhang, Kyle Polley, Jerry Ma · Fri, 13 Mar · cs.LG

Automated TEE Adaptation with LLMs: Identifying, Transforming, and Porting Sensitive Functions in Programs

This paper introduces AUTOTEE, the first LLM-based approach that automatically identifies, transforms, and ports sensitive functions from existing programs into Trusted Execution Environments (TEEs), achieving high accuracy and success rates in Java and Python while significantly reducing the manual effort and domain expertise required for developers.

Ruidong Han, Zhou Yang, Chengyan Ma, Ye Liu, Yuqing Niu, Siqi Ma, Debin Gao, David Lo · 2026-03-06 · cs.CR

Fast and Robust Speckle Pattern Authentication by Scale Invariant Feature Transform algorithm in Physical Unclonable Functions

This paper presents a fast and robust authentication method for optical Physical Unclonable Functions (PUFs) that utilizes the Scale Invariant Feature Transform (SIFT) algorithm to reliably extract unique features from speckle patterns, enabling secure verification even under geometric distortions like rotation, zooming, and cropping.

Giuseppe Emanuele Lio, Mauro Daniel Luigi Bruno, Francesco Riboli + 2 more · 2026-03-06 · physics.optics
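The matching stage behind SIFT-based speckle authentication can be sketched in a few lines. Assumptions throughout: real descriptors would come from a SIFT implementation (e.g., OpenCV's `SIFT_create`) run on speckle images, while here plain NumPy arrays stand in; the function names and thresholds (`ratio=0.75`, `min_matches=10`) are illustrative, not values from the paper.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Lowe's ratio test: keep a candidate match only if the nearest
    neighbour in desc_b is clearly closer than the second nearest."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # distances to all of desc_b
        j, k = np.argsort(dists)[:2]                # nearest and second nearest
        if dists[j] < ratio * dists[k]:
            matches.append((i, j))
    return matches

def authenticate(desc_probe, desc_enrolled, min_matches=10):
    """Accept the PUF response if enough descriptors survive the ratio
    test against the enrolled speckle pattern."""
    return len(ratio_test_matches(desc_probe, desc_enrolled)) >= min_matches
```

Because SIFT descriptors are scale- and rotation-invariant by construction, this matching step is what lets authentication survive the geometric distortions (rotation, zooming, cropping) the summary mentions.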