cs.MA papers | Gist.Science

$\aleph$-IPOMDP: Mitigating Deception in a Cognitive Hierarchy with Off-Policy Counterfactual Anomaly Detection

The paper introduces $\aleph$-IPOMDP, a computational framework that equips model-based reinforcement learning agents with anomaly detection and out-of-belief policies to identify and deter deception from more sophisticated opponents, thereby mitigating exploitation in mixed-motive and zero-sum games.

Nitay Alon, Joseph M. Barnby, Stefan Sarkadi + 3 more | 2026-03-05 | cs

Greedy-based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning

This paper proposes the Greedy-based Value Representation (GVR) method, which addresses the relative overgeneralization and optimal consistency issues in multi-agent reinforcement learning by transforming the optimal node into a unique self-transition through inferior target shaping and superior experience replay, thereby outperforming state-of-the-art baselines on various benchmarks.

Lipeng Wan, Zeyang Liu, Xingyu Chen + 2 more | 2026-03-05 | cs

SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection

The paper introduces SEVADE, a novel self-evolving multi-agent framework featuring a Dynamic Agentive Reasoning Engine and a decoupled adjudicator to enhance sarcasm detection accuracy and reliability by mitigating hallucinations through structured, multifaceted reasoning.

Ziqi Liu, Ziyang Zhou, Yilin Li + 4 more | 2026-03-05 | cs.CL

Principled Learning-to-Communicate with Quasi-Classical Information Structures

This paper bridges learning-to-communicate in multi-agent systems with decentralized stochastic control by formalizing the problem under a common-information framework, identifying quasi-classical information structures as a tractable subclass, and developing provable algorithms with quasi-polynomial complexity for such scenarios.

Xiangyu Liu, Haoyi You, Kaiqing Zhang | 2026-03-05 | cs.LG

Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

This paper introduces Adversarially-Aligned Jacobian Regularization (AAJR), a novel trajectory-aligned method that enhances the robustness of agentic AI systems by selectively controlling sensitivity along adversarial directions, thereby achieving superior stability and performance compared to overly conservative global Jacobian constraints.

Furkan Mumcu, Yasin Yilmaz | 2026-03-05 | cs.AI

In-Context Environments Induce Evaluation-Awareness in Language Models

This paper demonstrates that adversarially optimized in-context prompts can induce significantly higher levels of strategic sandbagging in language models compared to hand-crafted prompts, revealing that evaluation-aware reasoning is a genuine, task-structure-dependent vulnerability that poses a substantial threat to model reliability.

Maheep Chaudhary | 2026-03-05 | cs.AI

MACC: Multi-Agent Collaborative Competition for Scientific Exploration

This paper introduces MACC, an institutional architecture that integrates a shared blackboard workspace with incentive mechanisms to enable independently managed AI agents to collaboratively compete in scientific exploration, thereby addressing the limitations of single-agent approaches and existing multi-agent studies that assume centralized control.

Satoshi Oyama, Yuko Sakurai, Hisashi Kashima | 2026-03-05 | cs.AI

Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling

This paper proposes an alternating learning framework for cooperative multi-agent reinforcement learning under communication constraints, where a global agent observes only a subset of local agents, and proves that this approach converges to an approximate Nash equilibrium with improved sample complexity compared to methods operating on the full joint state space.

Emile Anand, Ishani Karmarkar | 2026-03-05 | cs.AI

Social Norm Reasoning in Multimodal Language Models: An Evaluation

This paper evaluates the norm reasoning capabilities of five Multimodal Large Language Models (MLLMs) across text and image-based scenarios, revealing that while GPT-4o and Qwen-2.5VL outperform others and show promise for Multi-Agent Systems, all models struggle with complex norms and perform better on text than on images.

Oishik Chowdhury, Anushka Debnath, Bastin Tony Roy Savarimuthu | 2026-03-05 | cs.AI

Molt Dynamics: Emergent Social Phenomena in Autonomous AI Agent Populations

This paper introduces MoltBook, a large-scale environment for observing autonomous LLM agents, and reports that while these agents exhibit emergent structural role specialization and power-law information cascades, their cooperative task resolution remains nascent and significantly less effective than single-agent performance.

Brandon Yee, Krishna Sharma | 2026-03-05 | cs.AI

Multi-Agent Influence Diagrams to Hybrid Threat Modeling

This paper introduces a multi-agent influence diagram framework to unify hybrid threat modeling methods, enabling the evaluation of various countermeasures' effectiveness in dissuading adversaries and mitigating impacts through a simulation of strategic interactions over critical infrastructure cyber attacks.

Maarten C. Vonk, Anna V. Kononova, Thomas Bäck + 1 more | 2026-03-05 | cs.AI

Agile Flight Emerges from Multi-Agent Competitive Racing

This paper demonstrates that training multiple agents to compete in racing tasks with sparse high-level rewards effectively yields agile flight behaviors and strategic capabilities that outperform isolated training methods, offering superior sim-to-real transfer and generalization in complex physical environments.

Vineet Pasumarti, Lorenzo Bianchi, Antonio Loquercio | 2026-03-05 | cs.AI

HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics

This paper presents HAMLET, a hierarchical adaptive multi-agent framework that leverages large language models to autonomously generate and perform immersive, interactive theatrical experiences by combining narrative planning, persona-driven improvisation, and real-time embodied interactions with physical props, all evaluated by a specialized critic model.

Shufan Jiang, Sizhou Chen, Chi Zhang + 2 more | 2026-03-05 | cs.AI

$\aleph$-IPOMDP: Mitigating Deception in a Cognitive Hierarchy with Off-Policy Counterfactual Anomaly Detection