No More, No Less: Least-Privilege Language Models

This paper proposes "Least-Privilege Language Models," a new deployment paradigm that dynamically restricts a model's internal computational capabilities during inference, rather than merely filtering its outputs, enabling fine-grained, policy-driven control over specific functionalities without retraining.

Paulius Rauba, Dominykas Seputis, Patrikas Vanagas + 1 more · 2026-03-05 · cs.LG
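The summary above does not specify the restriction mechanism. As a purely illustrative sketch (not the authors' method), one way to restrict internal capabilities at inference is to apply a policy-driven mask over a model's attention heads; the names `apply_head_mask` and the policy representation below are hypothetical.

```python
# Illustrative sketch only: policy-driven masking of per-head attention
# outputs at inference time. This is NOT the paper's mechanism; all names
# here are hypothetical.

def apply_head_mask(head_outputs, allowed_heads):
    """Zero out the outputs of attention heads not permitted by the policy.

    head_outputs: list of per-head output vectors (list of list of float)
    allowed_heads: set of head indices the policy permits
    """
    return [
        vec if i in allowed_heads else [0.0] * len(vec)
        for i, vec in enumerate(head_outputs)
    ]

# A "policy" here is just the set of heads a deployment is allowed to use,
# e.g. disabling heads associated with a capability the operator forbids.
outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
restricted = apply_head_mask(outputs, allowed_heads={0, 2})
print(restricted)  # head 1 is zeroed out
```

The key property this toy preserves is that the restriction happens inside the forward computation, not as a post-hoc filter on generated text.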

Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionnaires

This paper proposes and evaluates a hybrid semi-supervised semantic labeling pipeline that leverages clustering and Large Language Models to efficiently organize and retrieve third-party cybersecurity risk assessment questions, demonstrating that such semantic labels improve retrieval alignment while significantly reducing the cost and effort associated with manual or direct LLM-based labeling.

Ali Nour Eldin, Mohamed Sellami, Walid Gaaloul + 1 more · 2026-03-05 · cs.AI
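The pipeline above can be gestured at with a toy sketch (not the paper's implementation): cluster similar questionnaire questions first, then label each cluster once rather than labeling every question individually. The paper uses an LLM for the labeling step; here a most-frequent-token heuristic stands in for it, and all function names are assumptions.

```python
# Toy sketch of a hybrid semi-supervised labeling pipeline (not the
# paper's implementation): group questions by token overlap, then assign
# one label per cluster instead of one per question.
from collections import Counter

def jaccard(a, b):
    """Token-overlap similarity between two question strings."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b)

def cluster_questions(questions, threshold=0.3):
    clusters = []  # each cluster is a list of questions
    for q in questions:
        for c in clusters:
            if jaccard(q, c[0]) >= threshold:  # compare to cluster seed
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters

def label_cluster(cluster):
    # Stand-in for an LLM call: pick the most common non-stopword token.
    stop = {"do", "you", "is", "the", "a", "how"}
    tokens = [t for q in cluster
              for t in q.lower().rstrip("?").split() if t not in stop]
    return Counter(tokens).most_common(1)[0][0]

questions = [
    "Do you encrypt data at rest?",
    "Is data encrypted at rest?",
    "How often do you rotate access credentials?",
]
clusters = cluster_questions(questions)
labels = [label_cluster(c) for c in clusters]
```

The cost saving the paper reports comes from this shape: the expensive labeler runs once per cluster, not once per question.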

Skirting Additive Error Barriers for Private Turnstile Streams

This paper demonstrates that the previously established Ω(T^{1/4}) additive error lower bound for differentially private continual release of distinct elements and F_2 moments in turnstile streams can be circumvented by allowing algorithms to output estimates with both polylogarithmic multiplicative and additive errors while using only polylogarithmic space.

Anders Aamand, Justin Y. Chen, Sandeep Silwal · 2026-03-05 · cs

Sleeper Cell: Injecting Latent Malice Temporal Backdoors into Tool-Using LLMs

This paper introduces "Sleeper Cell," a novel multi-stage PEFT framework that injects latent, trigger-specific backdoors into tool-using LLMs by first implanting malicious capabilities via SFT and then reinforcing deceptive, benign-looking behaviors through GRPO, thereby creating stealthy agents that maintain high performance on standard benchmarks while executing destructive actions under specific conditions.

Bhanu Pallakonda, Mikkel Hindsbo, Sina Ehsani + 1 more · 2026-03-05 · cs.AI

SENTINEL: Stagewise Integrity Verification for Pipeline Parallel Decentralized Training

The paper proposes SENTINEL, a lightweight, momentum-based verification mechanism that ensures the integrity of pipeline-parallel decentralized training across untrusted nodes by detecting corrupted inter-stage communications without duplicating computation, thereby enabling secure training of large-scale models such as 4B-parameter LLMs.

Hadi Mohaghegh Dolatabadi, Thalaiyasingam Ajanthan, Sameera Ramasinghe + 5 more · 2026-03-05 · cs.LG
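The detection idea above can be illustrated with a minimal sketch (not SENTINEL itself): track a momentum-style exponential moving average of each stage's activation statistics and flag inter-stage messages that deviate sharply from it. The class name, the norm statistic, and the tolerance threshold below are all assumptions for the example.

```python
# Illustrative sketch (not SENTINEL): detect corrupted inter-stage
# activations by comparing each message against a momentum (EMA)
# estimate of its historical norm, with no duplicated computation.
import math

class StageMonitor:
    def __init__(self, beta=0.9, tolerance=3.0):
        self.beta = beta            # momentum coefficient for the EMA
        self.tolerance = tolerance  # allowed relative deviation
        self.ema = None             # running estimate of activation norm

    def check(self, activations):
        """Return True if the stage output looks consistent with history."""
        norm = math.sqrt(sum(x * x for x in activations))
        if self.ema is None:        # first observation seeds the estimate
            self.ema = norm
            return True
        ok = abs(norm - self.ema) <= self.tolerance * max(self.ema, 1e-8)
        if ok:
            # Only fold trusted observations into the running statistic.
            self.ema = self.beta * self.ema + (1 - self.beta) * norm
        return ok

monitor = StageMonitor()
honest = [[0.1 * i for i in range(8)] for _ in range(5)]
all_ok = all(monitor.check(a) for a in honest)
corrupted_ok = monitor.check([1e6] * 8)  # a tampered inter-stage message
```

The design point this mirrors is that the verifier keeps only a small running statistic per stage, so it is cheap enough to run alongside training rather than re-executing any stage's computation.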