Choosing Wisely: Evaluating Latent Factor Models in the Presence of a Contaminated Instrumental Variable with Varying Strength
Causal inference methods are widely used in empirical research; however, there is little evidence on the properties of shared latent factor estimators when the available instrumental variable (IV) is contaminated and a strong IV may not be available. We present a theoretical framework showing how the strength and the degree of contamination of the IV jointly determine the optimal choice of estimator. We perform Monte Carlo simulations with four outcome variables and one endogenous treatment variable, at sample sizes of 1000 and 2000 and with 1000 replications, to compare the finite-sample properties of the OLS, 2SLS, Shared Latent Factor without IV (SLF), and Shared Latent Factor with IV (SLF+IV) estimators. Our simulation results indicate that, for a given degree of contamination of the IV, there exists a threshold strength of the IV such that the SLF+IV estimator has a lower bias than the SLF estimator when the strength of the IV lies above that threshold, and a greater bias when it lies below. Finally, we demonstrate the applicability of the proposed estimators by studying the causal impact of maternal parity on four maternal and child health indicators: the child’s height-for-age percentile, the child’s weight-for-age percentile, the child’s haemoglobin count, and the mother’s haemoglobin count, using data from the fifth round (2019-21) of the National Family Health Survey (NFHS-5) of India. The empirical results suggest that lower parity is associated with higher height-for-age and weight-for-age percentiles and a higher haemoglobin count in children, and with a higher haemoglobin count in mothers.
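To make the interplay between IV strength and IV contamination concrete, the following is a minimal Monte Carlo sketch in Python. It is not the simulation design used in this paper: the data-generating process, the function name `simulate_bias`, and the parameters `pi` (first-stage strength) and `delta` (loading of the instrument on the latent confounder, i.e. contamination) are illustrative assumptions. It covers only the OLS and 2SLS estimators with a single outcome; the SLF and SLF+IV estimators are not implemented here because the abstract does not specify them.

```python
import numpy as np


def simulate_bias(pi, delta, beta=1.0, n=1000, reps=200, seed=0):
    """Mean bias of OLS and 2SLS under an assumed (hypothetical) design.

    pi    : strength of the instrument in the treatment equation
    delta : contamination, i.e. how strongly the instrument loads on
            the latent confounder u (delta = 0 gives a valid IV)
    """
    rng = np.random.default_rng(seed)
    ols_err, tsls_err = [], []
    for _ in range(reps):
        u = rng.standard_normal(n)                   # latent confounder
        z = delta * u + rng.standard_normal(n)       # possibly contaminated IV
        d = pi * z + u + rng.standard_normal(n)      # endogenous treatment
        y = beta * d + u + rng.standard_normal(n)    # single outcome

        # OLS slope with intercept: cov(d, y) / var(d)
        c_dy = np.cov(d, y)
        ols = c_dy[0, 1] / c_dy[0, 0]

        # Just-identified 2SLS with intercept: cov(z, y) / cov(z, d)
        tsls = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

        ols_err.append(ols - beta)
        tsls_err.append(tsls - beta)
    return float(np.mean(ols_err)), float(np.mean(tsls_err))


if __name__ == "__main__":
    # Vary IV strength at a fixed, assumed contamination level.
    for pi in (0.1, 0.5, 2.0):
        ols_bias, tsls_bias = simulate_bias(pi=pi, delta=0.3)
        print(f"pi={pi:>4}: OLS bias={ols_bias:.3f}, 2SLS bias={tsls_bias:.3f}")
```

Under this assumed design, a weak contaminated instrument leaves 2SLS more biased than OLS, while a sufficiently strong instrument reverses the ordering. This is analogous in spirit to the threshold result stated above for SLF+IV versus SLF, but it is only an illustration for the two simpler estimators.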