Predicting College Closures and Financial Distress
In this paper, we assemble the most comprehensive dataset to date on the characteristics of colleges and universities, including dates of operation, institutional setting, student body, staff, and finance data from 2002 to 2023. We provide an extensive description of what is known and unknown about closed colleges compared with institutions that did not close. Using these data, we first develop a series of predictive models of financial distress, drawing on factors such as operating revenue and expense patterns, sources of revenue, metrics of liquidity and leverage, enrollment and staffing patterns, and prior signs of significant financial strain. We benchmark these models against existing federal government screening mechanisms such as financial responsibility scores and heightened cash monitoring. We document a high degree of missing data among colleges that eventually close and show that this is a key impediment to identifying at-risk institutions. We then show that modern machine learning techniques, combined with richer data, are far more effective at predicting college closures than linear probability models, and considerably more effective than existing accountability metrics. Our preferred model, which combines an off-the-shelf machine learning algorithm with the richest set of explanatory variables, can significantly improve predictive accuracy even for institutions with complete data, but is particularly helpful for predicting instances of financial distress among institutions with incomplete data. Finally, we use our estimates to simulate likely increases in future closures, showing that, under reasonable scenarios, enrollment challenges resulting from an impending demographic cliff are likely to significantly increase annual college closures.