Ensembling Language Models with Sequential Monte Carlo
This paper introduces a unified framework for ensembling diverse language models via ensemble distributions, and proposes a byte-level sequential Monte Carlo algorithm to sample from these distributions. The approach overcomes challenges such as mismatched vocabularies and biased approximations, improving performance on structured text-generation tasks.
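To make the idea concrete, below is a minimal, hypothetical sketch of byte-level sequential Monte Carlo for sampling from an ensemble of two language models. The toy models (`model_p`, `model_q`), the two-byte alphabet, and the product-of-experts combination are all illustrative assumptions, not the paper's actual construction: particles extend sequences byte by byte using one model as the proposal, reweight by the (unnormalized) ensemble target over the proposal, and resample to avoid weight degeneracy.

```python
import random

random.seed(0)

# Toy byte-level "models" (hypothetical): each assigns a probability to the
# next byte given the context. Real models would condition on the context.
def model_p(context, byte):
    return {"a": 0.9, "b": 0.1}[byte]

def model_q(context, byte):
    return {"a": 0.4, "b": 0.6}[byte]

def ensemble_unnormalized(context, byte):
    # Product-of-experts-style combination (an assumption for illustration):
    # the target is proportional to the product of the models' probabilities.
    return model_p(context, byte) * model_q(context, byte)

def smc_sample(n_particles=100, length=5):
    # Each particle is a (byte sequence, importance weight) pair.
    particles = [("", 1.0) for _ in range(n_particles)]
    for _ in range(length):
        extended = []
        for seq, w in particles:
            # Proposal: draw the next byte from model_p.
            byte = "a" if random.random() < model_p(seq, "a") else "b"
            # Importance weight update: target / proposal for the new byte.
            w *= ensemble_unnormalized(seq, byte) / model_p(seq, byte)
            extended.append((seq + byte, w))
        # Multinomial resampling: keep high-weight particles, reset weights.
        total = sum(w for _, w in extended)
        weights = [w / total for _, w in extended]
        particles = [(random.choices(extended, weights=weights)[0][0], 1.0)
                     for _ in range(n_particles)]
    return [seq for seq, _ in particles]

samples = smc_sample()
```

Because the target rewards bytes that both models assign high probability, resampling steers the particle population toward sequences plausible under the whole ensemble, without ever normalizing the product distribution explicitly.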