Can RL Improve Generalization of LLM Agents? An Empirical Study
This paper empirically demonstrates that Reinforcement Fine-Tuning (RFT) enables LLM agents to generalize well across task difficulties within a single environment, but that it struggles with cross-environment transfer because of interface and semantic shifts. Sequential and mixture training strategies can mitigate forgetting and improve overall generalization.