Towards more efficient bias detection in financial language models
This paper proposes a cost-effective approach to detecting bias in financial language models. By leveraging cross-model patterns to identify bias-revealing inputs early, the approach uncovers up to 73% of a model's biased behaviors while testing only 20% of the input pairs, when input selection is guided by another model's outputs.
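The core idea can be sketched as a prioritization scheme: rank candidate input pairs by how strongly a cheaper reference model reacts to them, then spend the testing budget on the top-ranked pairs. The sketch below is a minimal, hypothetical simulation of this idea; the function names, the synthetic data, and the assumed correlation between models' bias are illustrative assumptions, not details from the paper.

```python
import random

random.seed(0)

def simulate_bias_scores(n_pairs):
    """Simulate (reference_score, target_is_biased) tuples for n_pairs inputs.
    A shared latent 'bias propensity' makes the reference model's score
    correlate with the target model's biased behavior (cross-model pattern)."""
    data = []
    for _ in range(n_pairs):
        latent = random.random()               # shared bias propensity of the pair
        ref_score = latent + 0.2 * random.random()  # noisy reference-model signal
        target_biased = latent > 0.7           # target is biased on high-latent pairs
        data.append((ref_score, target_biased))
    return data

def recall_at_budget(data, budget=0.2):
    """Fraction of the target model's biased behaviors found when testing
    only the top `budget` fraction of pairs, ranked by reference score."""
    ranked = sorted(data, key=lambda pair: -pair[0])
    k = int(len(ranked) * budget)
    found = sum(biased for _, biased in ranked[:k])
    total = sum(biased for _, biased in data)
    return found / total if total else 0.0

data = simulate_bias_scores(1000)
print(recall_at_budget(data, budget=0.2))
```

Under these assumptions, ranking by the reference model's scores recovers far more biased behaviors at a 20% budget than random sampling would (which in expectation finds only 20% of them); the paper's 73%-at-20% figure is the analogous measurement on real financial language models.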