The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely
A common challenge in estimating the impact of interventions (e.g., job training programs, educational programs) is that many outcomes of interest (e.g., lifetime earnings or other labor market outcomes) are observed only after a long delay. In biomedical settings this is often addressed by using short-term outcomes as so-called “surrogates” for the outcome of interest, e.g., tumor size as a surrogate for mortality in cancer studies. We build on this literature by systematically combining multiple, possibly qualitatively distinct, short-term outcomes (e.g., short-run earnings and employment indicators) into a “surrogate index.” Under the Prentice surrogacy assumption, which requires that the primary outcome be independent of the treatment conditional on the surrogates, we show that the average treatment effect on the surrogate index equals the average treatment effect on the long-term outcome. We also relate the surrogacy assumption to a set of structural, causal assumptions. We then characterize the bias that arises from violations of each of the key assumptions, and we provide simple methods to validate these assumptions using additional observed outcomes. We apply our method to analyze the long-term impacts of a multi-site job training experiment in California. Rather than waiting a full nine years to observe the long-term impact directly, we show that it is possible to use outcomes from the first six quarters as surrogates. One could have estimated the program’s long-term impacts on mean employment rates using the employment rates observed in the first six quarters, with a 35% reduction in standard errors.
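To make the construction concrete, the following is a minimal sketch on simulated data, not the estimator used in the paper. It assumes that the surrogate index is the predicted long-term outcome given the surrogates (the abstract does not define it), that this prediction is fit by ordinary least squares in a separate sample where both surrogates and the long-term outcome are observed, and that the surrogate-outcome relationship is the same in both samples. All function names, coefficients, and the data-generating process are hypothetical and chosen only so that surrogacy holds by construction.

```python
import numpy as np


def fit_surrogate_index(S_obs, Y_obs):
    """Fit E[Y | S] by OLS in a sample where S and Y are both observed.

    Linear regression is an assumption of this sketch; any predictor
    of Y from S could be substituted.
    """
    X = np.column_stack([np.ones(len(S_obs)), S_obs])
    coef, *_ = np.linalg.lstsq(X, Y_obs, rcond=None)
    return coef  # intercept first, then one slope per surrogate


def surrogate_index(S, coef):
    """Predicted long-term outcome for each unit, given its surrogates."""
    return coef[0] + S @ coef[1:]


def estimate_long_term_effect(S_exp, W_exp, coef):
    """Difference in mean surrogate index between treated and control units."""
    index = surrogate_index(S_exp, coef)
    return index[W_exp == 1].mean() - index[W_exp == 0].mean()


def simulate(seed=0, n=20000):
    """Hypothetical data in which treatment affects Y only through S."""
    rng = np.random.default_rng(seed)
    beta = np.array([0.5, 1.0, -0.3])   # effect of each surrogate on Y
    shift = np.array([0.2, 0.4, 0.1])   # effect of treatment on each surrogate

    # Observational sample: surrogates and long-term outcome, no treatment.
    S_obs = rng.normal(size=(n, 3))
    Y_obs = 1.0 + S_obs @ beta + rng.normal(size=n)

    # Experimental sample: randomized treatment and surrogates only.
    W = rng.integers(0, 2, size=n)
    S_exp = rng.normal(size=(n, 3)) + np.outer(W, shift)

    true_effect = float(shift @ beta)   # long-term effect implied by the design
    return S_obs, Y_obs, S_exp, W, true_effect


if __name__ == "__main__":
    S_obs, Y_obs, S_exp, W, true_effect = simulate()
    coef = fit_surrogate_index(S_obs, Y_obs)
    estimate = estimate_long_term_effect(S_exp, W, coef)
    print(f"true long-term effect: {true_effect:.3f}")
    print(f"surrogate-index estimate: {estimate:.3f}")
```

Because the simulated treatment shifts the long-term outcome only through the three surrogates, the difference in mean index values should be close to the true long-term effect. If the treatment also affected the outcome directly, surrogacy would fail and the same estimator would be biased.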