Variable Selection for Linear Regression Imputation in Surveys
This paper addresses the underexplored problem of variable selection for linear regression imputation in survey data. It defines an optimal model via an oracle loss function, analyzes the consequences of model misspecification, and proposes a framework for constructing asymptotically valid and optimal confidence intervals.