Plant-level Productivity and Imputation of Missing Data in U.S. Census Manufacturing Data
Within-industry differences in measured plant-level productivity are large, and a large literature has been devoted to explaining their causes and consequences. In its manufacturing data, the U.S. Census Bureau imputes missing values using methods known to underestimate variability and to potentially bias multivariate inferences. We present an alternative strategy for handling the missing data based on multiple imputation via sequences of classification and regression trees. We use our imputations and the Bureau's imputations to estimate within-industry productivity dispersions. The results suggest that there is more within-industry productivity dispersion than previous research has indicated. Using the improved imputations, we also estimate relationships between productivity and market structure, and among output prices, capital, and the probability of plant exit (controlling for productivity). For some estimands, our results differ substantially from those based on the Census Bureau's imputations.