LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks
The paper introduces LoRA-Ensemble, a parameter-efficient uncertainty-modelling method that uses Low-Rank Adaptation (LoRA) to build an implicit ensemble inside a single self-attention network: the ensemble members share the frozen backbone weights and differ only in their low-rank adapter matrices. The method matches the accuracy of explicit ensembles, achieves better calibration, and substantially reduces computational and memory costs.
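To make the idea concrete, here is a minimal numpy sketch of an implicit LoRA ensemble, assuming the usual LoRA parameterization (a shared frozen weight `W` plus per-member low-rank updates `B_i @ A_i`, with `B` initialized to zero) and a simple mean over member outputs; the variable names and the single linear layer are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, M = 8, 2, 4  # hidden size, LoRA rank, number of ensemble members

# Shared frozen weight (e.g. one attention projection matrix).
W = rng.standard_normal((d, d))

# Each member i owns only its low-rank factors A_i (r x d) and B_i (d x r),
# so the per-member cost is 2*d*r parameters instead of d*d.
A = 0.1 * rng.standard_normal((M, r, d))
B = np.zeros((M, d, r))  # LoRA convention: B starts at zero

def member_forward(x, i):
    """Forward pass through member i: (W + B_i @ A_i) @ x."""
    return (W + B[i] @ A[i]) @ x

def ensemble_predict(x):
    """Implicit ensemble: average the M members' outputs."""
    return np.mean([member_forward(x, i) for i in range(M)], axis=0)

x = rng.standard_normal(d)
out = ensemble_predict(x)
```

In training, only `A` and `B` would receive gradients (per member), while `W` stays frozen; at inference the spread of the member outputs can serve as an uncertainty estimate.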