Rigidity in LLM Bandits with Implications for Human-AI Dyads
This paper demonstrates that large language models exhibit robust decision biases in two-armed bandit tasks: they exploit stubbornly and learn at low rates, and these tendencies persist across decoding parameters. Such rigidity poses significant challenges for optimal human-AI collaboration.
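To make the claimed bias concrete, the following is a minimal sketch (not the paper's method) of a two-armed bandit paired with a delta-rule learner. The function name, the reward probabilities, and the parameter values are illustrative assumptions; a low learning rate `alpha` slows value updates, and a low exploration rate `epsilon` produces the kind of stubborn exploitation the paper describes.

```python
import random

def simulate(alpha=0.05, epsilon=0.02, p_rewards=(0.3, 0.7), n_trials=500, seed=0):
    """Simulate a two-armed bandit with a delta-rule (Rescorla-Wagner) learner.

    alpha   -- learning rate (low -> slow value updates)
    epsilon -- exploration probability (low -> stubborn exploitation)
    All parameter values here are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]  # estimated value of each arm
    choices = []
    for _ in range(n_trials):
        if rng.random() < epsilon:
            arm = rng.randrange(2)          # rare exploratory choice
        else:
            arm = 0 if q[0] >= q[1] else 1  # greedy exploitation
        reward = 1.0 if rng.random() < p_rewards[arm] else 0.0
        q[arm] += alpha * (reward - q[arm])  # delta-rule value update
        choices.append(arm)
    return q, choices

q, choices = simulate()
print("final value estimates:", q)
print("share of choices on better arm:", choices.count(1) / len(choices))
```

With a low `alpha` and `epsilon`, an early lucky streak on the worse arm can lock the agent into exploiting it for many trials, since value estimates update too slowly and exploration is too rare to correct the mistake.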