Best Linear Approximations to Set Identified Functions: With an Application to the Gender Wage Gap
This paper provides inference methods for best linear approximations to functions that are known to lie within a band. It extends the partial identification literature by allowing the upper and lower functions defining the band to carry an index, and to be unknown but parametrically or nonparametrically estimable functions. The identification region of the parameters of the best linear approximation is characterized via its support function, and limit theory is developed for the latter. We prove that the support function can be approximated by a Gaussian process and establish the validity of the Bayesian bootstrap for inference. Because the bounds may carry an index, the approach covers many canonical examples in the partial identification literature that arise with interval-valued outcome and/or regressor data: not only mean regression, but also quantile and distribution regression, including sample selection problems, as well as mean, quantile, and distribution treatment effects. In addition, the framework can account for the availability of instruments. We apply the methods to female labor force participation, using data from Mulligan and Rubinstein (2008) and insights from Blundell, Gosling, Ichimura, and Meghir (2007). Our results yield robust evidence of a gender wage gap, in both the 1970s and the 1990s, at quantiles of the wage distribution up to 0.4, while allowing for completely unrestricted selection into the labor force. Under the assumption that the median wage offer of the employed is larger than that of individuals who do not work, the evidence of a gender wage gap extends to quantiles up to 0.7. When the assumption is further strengthened to require stochastic dominance, the evidence of a gender wage gap extends to all quantiles, and there is some evidence at the 0.8 and higher quantiles that the gender wage gap decreased between the 1970s and the 1990s.
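
As a rough illustration of the support-function characterization, the sketch below covers only the simplest case: a mean regression with an interval-valued outcome, no index, and no instruments. It relies on the fact that, for a direction q, the largest value of q'b over the identification region is attained by taking the upper bound wherever z_q = q'E[xx']^{-1}x is positive and the lower bound elsewhere. The function names, the simulated data, and the use of normalized exponential weights for the Bayesian bootstrap are assumptions made for this example; none of them is taken from the paper's implementation.

```python
import numpy as np

def support_function(q, x, lower, upper, weights=None):
    """Weighted sample analog of sigma(q) = E[z_q * (upper if z_q > 0 else lower)],
    where z_q = q' E[xx']^{-1} x. Hypothetical helper, not the paper's code."""
    n = x.shape[0]
    w = np.full(n, 1.0 / n) if weights is None else weights / weights.sum()
    sigma_xx = (x * w[:, None]).T @ x        # weighted estimate of E[xx']
    z = x @ np.linalg.solve(sigma_xx, q)     # z_q for each observation
    bound = np.where(z > 0, upper, lower)    # band edge that maximizes q'b
    return float(np.sum(w * z * bound))

def bayesian_bootstrap_draws(q, x, lower, upper, draws=200, seed=0):
    """Recompute the support function under i.i.d. exponential(1) weights
    (assumed weighting scheme; the weights are normalized inside support_function)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    return np.array([
        support_function(q, x, lower, upper, rng.exponential(size=n))
        for _ in range(draws)
    ])

# Simulated data: the latent outcome y is only observed to lie in [y_lo, y_hi].
rng = np.random.default_rng(0)
n = 500
v = rng.normal(size=n)
x = np.column_stack([np.ones(n), v])
y = 1.0 + 2.0 * v + rng.normal(size=n)
y_lo, y_hi = np.floor(y), np.floor(y) + 1.0

q = np.array([0.0, 1.0])                          # direction selecting the slope
slope_hi = support_function(q, x, y_lo, y_hi)     # upper end of the slope interval
slope_lo = -support_function(-q, x, y_lo, y_hi)   # lower end of the slope interval
draws = bayesian_bootstrap_draws(q, x, y_lo, y_hi)
```

Evaluating the support function at q and at -q gives the endpoints of the projection of the identification region onto that direction, here the slope coefficient. The spread of `draws` gives only an informal sense of sampling variability; it is not a substitute for the limit theory developed in the paper.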