Advances in science and innovation are the primary source of sustained improvements in individual well-being and economic growth. This project examines the origins of American leadership in science and innovation in the 20th century, starting with three major questions: 1) How has socioeconomic inequality influenced participation in American science? 2) How does a person’s socioeconomic status influence their research output and the recognition they receive, conditional on becoming a scientist? 3) What is the role of universities in encouraging broad-based participation?
A major challenge for empirical analyses of science lies in the absence of systematic long-run data on successful scientists, including their socioeconomic background, education, work histories, inventions, and publications. This project has constructed such data for more than 100,000 American scientists who were born between 1817 and 1933. We have developed machine-learning algorithms to link each scientist with their census records, patents, and publications. Using these data, we first show that children from families of low socioeconomic status (SES) are underrepresented in science. Scientists from high-SES families are more likely to attend elite universities, and they publish more. Moreover, we find that a person’s social background influences perceptions of their ability and contributions, even conditional on their education and publications.
Code created for this project is posted on GitHub.