Using Genetic Markers to Measure the Effect of Health on Education

10/20/2009
Featured in print Bulletin on Aging & Health

Researchers have been keenly interested in the links between socioeconomic status and health going back at least to the famous Whitehall I study of British civil servants, which found that low-grade employees had a mortality rate three times that of their high-grade counterparts. Yet determining the exact nature of the relationship between education and health is notoriously difficult, since it is likely both that health affects educational outcomes and that education affects health outcomes. For example, while numerous studies report that students who are obese or depressed perform poorly relative to their classmates, it is far from certain that this represents a causal effect of health on education, as the causality could run from education to health or the correlation could be driven by a third factor.

In The Impact of Poor Health on Education: New Evidence Using Genetic Markers (NBER Working Paper 12304), Weili Ding, Steven Lehrer, J. Niels Rosenquist, and Janet Audrain-McGovern introduce a novel approach to study this question.

Following the recent decoding of the human genome, a sequence of approximately three billion chemical "letters" that make up human DNA, there is a growing body of neuroscientific evidence identifying genetic markers that have strong associations with specific diseases and health behaviors. The authors use these genetic markers to predict children's health, then use the predicted health measures to estimate the effect of health on education. Because genetic markers are determined at conception, before the child is exposed to any other influences, the use of predicted health measures based solely on these markers allows the authors to identify a causal effect of health on education.
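
The two-step logic described above (predict health from markers, then relate education to predicted health) can be sketched as a two-stage least squares estimate. This is only an illustration on simulated data, not the authors' data or their exact estimator: the single marker, the unobserved confounder, the health index (higher means worse health), and every coefficient, including the assumed true effect of -0.8, are hypothetical values chosen for the example.

```python
import numpy as np

def ols_slope(y, x):
    """Slope from an ordinary least squares regression of y on x (with intercept)."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def two_stage_least_squares(y, x, z):
    """Effect of x on y, using z as an instrument for x (intercepts included)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z.reshape(n, -1)])
    # Stage 1: predict health from the genetic marker alone.
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ gamma
    # Stage 2: regress the outcome on predicted health.
    return ols_slope(y, x_hat)

# Simulated data; all numbers below are assumptions for illustration.
rng = np.random.default_rng(0)
n = 20000
marker = rng.binomial(2, 0.3, size=n).astype(float)  # hypothetical allele count: 0, 1, or 2
confounder = rng.normal(size=n)                      # unobserved factor that worsens health and lowers grades
health = 0.5 * marker + confounder + rng.normal(size=n)
true_effect = -0.8                                   # assumed causal effect of poor health on GPA
gpa = 3.0 + true_effect * health - 1.0 * confounder + rng.normal(size=n)

naive = ols_slope(gpa, health)                       # biased by the confounder
iv = two_stage_least_squares(gpa, health, marker)    # uses only marker-driven variation in health

print(f"naive OLS estimate: {naive:.2f}")
print(f"2SLS estimate:      {iv:.2f}")
```

In this simulation the naive regression overstates the harm of poor health because the confounder moves health and grades together, while the marker-based estimate lands close to the assumed true effect. That only holds because the simulated marker affects grades solely through health, which is the key assumption behind this kind of approach.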

The data for the analysis come from the Georgetown Adolescent Tobacco Research (GATOR) study, a unique longitudinal data set of Virginia high school students that combines data from questionnaires with genetic information. The sample size is about one thousand students.

The authors' first task is to show that there is a strong relationship between genetic markers and the health behaviors and outcomes they wish to predict. They focus on four genes identified as relevant by the scientific literature. For example, one gene (CYP2B6) is related to liver enzymes that break down toxins such as nicotine, while another (DRD2) is believed to determine the number of receptors for dopamine, a key chemical linked to the brain's reward system. Each individual inherits a single copy of each gene from each parent, so for each marker the individual may have two copies of the same variant or two different variants of the gene.

When the authors compare the health of students with different genetic markers, they find striking differences. For example, individuals with the rare TT form of the CYP gene are approximately 8.5 percentage points more likely to be diagnosed with inattention and hyperactivity than those with other forms of the CYP gene, while individuals with the common A2A2 form of the DRD2 gene are substantially less likely to be depressed or obese than those with other combinations. Overall, the four genetic markers have strong and significant associations with health behaviors and outcomes.

Next, the authors turn to estimating the effect of their predicted health measures on educational outcomes. For girls, depression and obesity have very substantial effects: each lowers a student's grade point average by 0.8 points (on a 4-point scale). The effect of attention deficit hyperactivity disorder for girls is near zero, though this masks a negative effect of inattention that is offset by a positive effect of hyperactivity. Interestingly, the results for boys are smaller and statistically insignificant. More research is needed to understand these gender differences.

As the authors test the validity of their model, two other important findings emerge. The first is the importance of treating smoking as an active choice made by the student, as the authors find that either ignoring smoking or treating it as exogenous leads to estimated effects of health on education that are implausibly large and sometimes wrong-signed. The second is the importance of accounting for the coexistence of health outcomes and behaviors, as the authors find that the estimated effect of one health condition such as depression on education is generally overstated when other conditions such as obesity are omitted from the model.

The authors caution that their estimates reflect the total effect of genes on educational outcomes and might include dynastic effects. Specifically, they are unable to "disentangle the impact of the health condition as explained by genes from that of the response from the environment to the health conditions as explained by genes," such as how parents, teachers, and peers respond to a student's health status. They note that this limitation is shared by other strategies that treat genetics as part of a black box and conclude that substantially richer data would be needed to separately identify these different impacts.

In conclusion, the authors note that "recent years have witnessed an explosion of findings on the causes and correlates of health outcomes and behaviors in neurobiology, which could offer a promising source of predetermined exogenous variations to help identify the impact of health on a set of outcomes of great interest to economists," such as labor market activity, marriage, and educational attainment. Further, the use of data on genetic markers could also permit empirical "researchers to investigate whether nurture inputs or family characteristics can offset the impact of genetic predispositions."