Measuring the Race, Ethnic, and Gender Composition of Company Workforces Using LinkedIn Data
Stronger enforcement of discrimination laws can help reduce disparities in economic outcomes by race, ethnicity, and gender in the United States. However, the data needed to detect possible discrimination and to act to counter it is not publicly available, in particular data on racial, ethnic, and gender disparities within specific companies. In this paper, we explore and develop methods that use information extracted from publicly available LinkedIn data to measure the racial, ethnic, and gender composition of company workforces. We use predictive tools based on both names and pictures to identify race, ethnicity, and gender. Validation exercises that compare estimates from scraped LinkedIn data with two benchmarks, American Community Survey (ACS) data and company diversity or EEO-1 reports, show that LinkedIn data can yield reasonably reliable measures of workforce composition by race, ethnicity, and gender. Finally, we apply our methods to study the racial, ethnic, and gender composition of workers who experienced a mass layoff at a large company.