Replayable Financial Agents: A Determinism-Faithfulness Assurance Harness for Tool-Using LLM Agents
This paper introduces the Determinism-Faithfulness Assurance Harness (DFAH), a framework and set of financial benchmarks. It shows that decision determinism and task accuracy in LLM agents are uncorrelated, so the two must be measured independently to ensure reliable regulatory audit replay in financial services.
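
To illustrate why determinism and accuracy need separate measurement, the following minimal sketch scores a set of replayed agent decisions on both axes. The function names, the modal-agreement definition of determinism, and the sample decisions are assumptions made for illustration; they are not DFAH's actual metrics or data.

```python
from collections import Counter
from typing import Sequence


def decision_determinism(decisions: Sequence[str]) -> float:
    """Fraction of replayed runs that agree with the most common decision.

    Assumed definition for illustration only, not the harness's metric.
    """
    if not decisions:
        raise ValueError("need at least one replayed run")
    modal_count = Counter(decisions).most_common(1)[0][1]
    return modal_count / len(decisions)


def task_accuracy(decisions: Sequence[str], expected: str) -> float:
    """Fraction of replayed runs whose decision matches the expected label."""
    if not decisions:
        raise ValueError("need at least one replayed run")
    return sum(d == expected for d in decisions) / len(decisions)


# Hypothetical replays of the same input; the expected decision is "deny".
expected = "deny"
consistent_but_wrong = ["approve", "approve", "approve", "approve"]
inconsistent_partly_right = ["deny", "approve", "deny", "escalate"]

for name, runs in [
    ("consistent_but_wrong", consistent_but_wrong),
    ("inconsistent_partly_right", inconsistent_partly_right),
]:
    print(name, decision_determinism(runs), task_accuracy(runs, expected))
```

In this toy data the first agent replays identically on every run yet is never correct, while the second is right half the time but cannot be replayed reliably. Neither score predicts the other, which is why an audit would need to report both.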