Code-Space Response Oracles: Generating Interpretable Multi-Agent Policies with Large Language Models
This paper introduces Code-Space Response Oracles (CSRO), a novel framework that replaces black-box deep reinforcement learning oracles with large language models (LLMs). The LLM oracles generate multi-agent policies as human-readable code, achieving competitive performance while enabling the discovery of complex, explainable strategies.
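To make the idea of an oracle that returns code concrete, here is a minimal sketch of a response-oracle loop in which every policy is stored as Python source text. It is not the paper's implementation. The game (rock-paper-scissors), the uniform weighting over the population, and all names (`payoff`, `compile_policy`, `stub_code_oracle`, `run_loop`) are assumptions made for illustration. In particular, `stub_code_oracle` is a hypothetical stand-in for an LLM call: it emits the source of a constant best response so that the example runs without any model.

```python
from typing import Callable, List

# A policy maps the opponent's past actions to an action in {0: rock, 1: paper, 2: scissors}.
Policy = Callable[[List[int]], int]


def payoff(a: int, b: int) -> int:
    """Rock-paper-scissors payoff for the player choosing a against b."""
    return [0, 1, -1][(a - b) % 3]


def compile_policy(source: str) -> Policy:
    """Turn policy source code into a callable; assumes it defines `policy`."""
    namespace: dict = {}
    # exec is acceptable here only because the code is trusted;
    # code produced by a real LLM would need sandboxing.
    exec(source, namespace)
    return namespace["policy"]


def stub_code_oracle(opponent_sources: List[str], weights: List[float]) -> str:
    """Hypothetical stand-in for an LLM oracle.

    Returns source code for a constant policy that best responds to the
    weighted mixture of opponent policies (evaluated on an empty history).
    """
    opponents = [compile_policy(s) for s in opponent_sources]

    def value(action: int) -> float:
        return sum(w * payoff(action, opp([])) for w, opp in zip(weights, opponents))

    best = max(range(3), key=value)
    return (
        "def policy(history):\n"
        "    # constant best response to the current opponent mixture\n"
        f"    return {best}\n"
    )


def run_loop(initial_source: str, iterations: int) -> List[str]:
    """Grow a population of code policies by repeatedly querying the oracle."""
    population = [initial_source]
    for _ in range(iterations):
        # Uniform weights are a simplifying assumption, not a solved meta-game.
        weights = [1.0 / len(population)] * len(population)
        population.append(stub_code_oracle(population, weights))
    return population


if __name__ == "__main__":
    for source in run_loop("def policy(history):\n    return 0\n", 2):
        print(source)
```

Because each population member is plain source text, it can be read and audited directly, which is the interpretability property the summary emphasizes. Starting from an always-rock policy, two iterations of this sketch add two always-paper policies.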