Governing Evolving Memory in LLM Agents: Risks, Mechanisms, and the Stability and Safety-Governed Memory (SSGM) Framework

This paper introduces the Stability and Safety-Governed Memory (SSGM) framework to address critical risks like memory corruption, semantic drift, and privacy vulnerabilities in evolving LLM agents by decoupling memory evolution from execution through consistency verification, temporal decay modeling, and dynamic access control.

Chingkwun Lam, Jiaxin Li, Lingfei Zhang, Kuo Zhao · 2026-03-13 · 🤖 cs.AI

An Automatic Text Classification Method Based on Hierarchical Taxonomies, Neural Networks and Document Embedding: The NETHIC Tool

This paper presents NETHIC, an automatic text classification tool that combines scalable neural networks with hierarchical taxonomies and document embeddings to achieve significant improvements in both effectiveness and efficiency across generic and domain-specific corpora.

Luigi Lomasto, Rosario Di Florio, Andrea Ciapetti, Giuseppe Miscione, Giulia Ruggiero, Daniele Toti · 2026-03-13 · 🤖 cs.AI

DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering

DocSage is an end-to-end agentic framework that addresses the limitations of existing RAG systems in multi-document, multi-entity question answering by integrating dynamic schema discovery, error-aware structured extraction, and schema-aware relational reasoning to significantly improve cross-document evidence aggregation and accuracy.

Teng Lin, Yizhang Zhu, Zhengxuan Zhang, Yuyu Luo, Nan Tang · 2026-03-13 · 🤖 cs.AI

Automating Skill Acquisition through Large-Scale Mining of Open-Source Agentic Repositories: A Framework for Multi-Agent Procedural Knowledge Extraction

This paper presents a framework for automating the acquisition of specialized procedural agent skills by systematically mining open-source repositories to extract, standardize, and evaluate capabilities like mathematical visualization, demonstrating that such methods can significantly enhance LLM performance in autonomous workflows without requiring model retraining.

Shuzhen Bi, Mengsong Wu, Hao Hao, Keqian Li, Wentao Liu, Siyu Song, Hongbo Zhao, Aimin Zhou · 2026-03-13 · 🤖 cs.AI

RADAR: Closed-Loop Robotic Data Generation via Semantic Planning and Autonomous Causal Environment Reset

RADAR is a fully autonomous, closed-loop robotic data generation framework that leverages a four-module pipeline—combining vision-language semantic planning, graph neural network policies, automated success evaluation, and a causal state-machine reset mechanism—to overcome human-in-the-loop bottlenecks and achieve high success rates in both simulation and real-world complex manipulation tasks.

Yongzhong Wang, Keyu Zhu, Yong Zhong, Liqiong Wang, Jinyu Yang, Feng Zheng · 2026-03-13 · 🤖 cs.AI

VisiFold: Long-Term Traffic Forecasting via Temporal Folding Graph and Node Visibility

The paper proposes VisiFold, a novel framework that addresses the computational and dependency challenges of long-term traffic forecasting by introducing a temporal folding graph to consolidate temporal snapshots and a node visibility mechanism to efficiently handle large-scale spatial data, thereby significantly reducing resource consumption while outperforming existing baselines.

Zhiwei Zhang, Xinyi Du, Weihao Wang, Xuanchi Guo, Wenjuan Han · 2026-03-13 · 🤖 cs.AI

Automated Detection of Malignant Lesions in the Ovary Using Deep Learning Models and XAI

This research utilizes various Convolutional Neural Network architectures and Explainable AI techniques on a histopathology dataset to develop and evaluate an InceptionV3 model that achieves 94% accuracy in the automated detection of malignant ovarian lesions, aiming to improve non-invasive diagnostic procedures.

Md. Hasin Sarwar Ifty, Nisharga Nirjan, Labib Islam, M. A. Diganta, Reeyad Ahmed Ornate, Anika Tasnim, Md. Saiful Islam · 2026-03-13 · 🤖 cs.AI

You Told Me to Do It: Measuring Instructional Text-induced Private Data Leakage in LLM Agents

This paper identifies and quantifies a critical "Trusted Executor Dilemma" in high-privilege LLM agents, demonstrating through the ReadSecBench benchmark that agents systematically fail to distinguish malicious instructions embedded in documentation from legitimate guidance, leading to high rates of data exfiltration that current defenses cannot reliably detect.

Ching-Yu Kao, Xinfeng Li, Shenyu Dai, Tianze Qiu, Pengcheng Zhou, Eric Hanchen Jiang, Philip Sperl · 2026-03-13 · 🤖 cs.AI

CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges

This paper introduces CreativeBench, a novel benchmark for objectively evaluating machine creativity in code generation through a unified quality-novelty metric, and proposes EvoRePE, an inference-time strategy that leverages self-evolving patterns to enhance creative performance while revealing key insights into how model scaling affects different creativity types.

Zi-Han Wang, Lam Nguyen, Zhengyang Zhao, Mengyue Yang, Chengwei Qin, Yujiu Yang, Linyi Yang · 2026-03-13 · 🤖 cs.AI

Social, Legal, Ethical, Empathetic and Cultural Norm Operationalisation for AI Agents

This paper proposes a systematic framework for operationalizing social, legal, ethical, empathetic, and cultural (SLEEC) norms into concrete, verifiable requirements for AI agents, while surveying current methods and outlining a research agenda to bridge the gap between abstract normative principles and practical implementation in high-stakes domains.

Radu Calinescu, Ana Cavalcanti, Marsha Chechik, Lina Marsso, Beverley Townsend · 2026-03-13 · 🤖 cs.AI

AdaFuse: Accelerating Dynamic Adapter Inference via Token-Level Pre-Gating and Fused Kernel Optimization

AdaFuse is a framework that accelerates dynamic adapter inference in Large Language Models through a token-level pre-gating strategy that collapses per-layer routing into a single global routing decision, executed via a custom fused CUDA kernel; it achieves over 2.4x lower decoding latency while maintaining accuracy.

Qiyang Li, Rui Kong, Yuchen Li, Hengyi Cai, Shuaiqiang Wang, Linghe Kong, Guihai Chen, Dawei Yin · 2026-03-13 · 🤖 cs.AI

Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language

This paper introduces Bielik-Minitron-7B, a compressed 7.35B-parameter Polish language model created by applying structured pruning and knowledge distillation to the Bielik-11B-v3.0 model, which achieves a 33.4% parameter reduction and up to 50% inference speedup while retaining approximately 90% of the original model's performance.

Remigiusz Kinas, Paweł Kiszczak, Sergio P. Perez, Krzysztof Ociepa, Łukasz Flis, Krzysztof Wróbel, Adrian Gwozdziej · 2026-03-13 · 💬 cs.CL