Predicting College Closures and Financial Distress
In this paper, we assemble the most comprehensive dataset to date on the characteristics of colleges and universities, including dates of operation, institutional setting, student body, staff, and finance data from 2002 to 2023. We begin by describing what is known and unknown about closed colleges compared with institutions that did not close. We then develop a series of predictive models of financial distress, using factors such as operating revenue and expense patterns, sources of revenue, measures of liquidity and leverage, enrollment and staffing patterns, and prior signs of significant financial strain. We benchmark these models against existing federal government screening mechanisms such as financial responsibility scores and heightened cash monitoring. We document a high degree of missing data among colleges that eventually close and show that this is a key impediment to identifying at-risk institutions. Next, we show that modern machine learning techniques combined with richer data are more effective at predicting college closures than linear probability models, and far more effective than existing accountability metrics. These algorithms substantially improve predictive accuracy even for institutions with complete data, and they are particularly helpful for predicting financial distress at institutions with spotty data. Finally, we use our estimates to simulate likely increases in future closures, showing that, under reasonable scenarios, enrollment challenges resulting from an impending demographic cliff are likely to significantly increase annual college closures.