Unsupervised Machine Learning for Explainable Health Care Fraud Detection

Shubhranshu Shekhar; Jetson Leder-Luis; Leman Akoglu

doi:10.3386/w30946

Working Paper 30946

Issue Date February 2023

The US spends more than 4 trillion dollars per year on health care, most of it delivered by private providers and reimbursed by insurers. A major concern in this system is overbilling, waste, and fraud by providers, who face incentives to misreport on their claims in order to receive higher payments. In this work, we develop novel machine learning tools to identify providers that overbill insurers. Using large-scale claims data from Medicare, the US federal health insurance program for elderly adults and the disabled, we identify patterns consistent with fraud or overbilling among inpatient hospitalizations. Our proposed approach for fraud detection is fully unsupervised, not relying on any labeled training data, and is explainable to end users, providing reasoning and interpretable insights into the potentially suspicious behavior of the flagged providers. Data from the Department of Justice on providers facing anti-fraud lawsuits and case studies of suspicious providers validate our approach and findings. We also perform a post-analysis to understand which hospital characteristics, not used for detection, are associated with a high suspiciousness score. Our method provides an 8-fold lift over random targeting, and can be used to guide investigations and auditing of suspicious providers for both public and private health insurance systems.
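To make the "lift over random targeting" figure concrete, the sketch below shows one common way such a lift can be computed: the share of known positives among the k highest-scored providers, divided by the share of positives in the whole population. This is an illustration only. The abstract does not state the exact definition or cutoff the authors use, and the function name `lift_at_k`, the scores, and the labels are all hypothetical.

```python
def lift_at_k(scores, labels, k):
    """Precision among the k highest-scored items divided by the base rate.

    scores: suspiciousness scores, higher means more suspicious (hypothetical).
    labels: 1 if the provider is a known positive (e.g. named in a lawsuit), else 0.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")

    # Base rate: the hit rate expected when targeting providers at random.
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("no positive labels; lift is undefined")

    # Rank providers from most to least suspicious and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precision_at_k = sum(labels[i] for i in ranked[:k]) / k

    return precision_at_k / base_rate


# Made-up toy data: 10 providers, 2 of them known positives.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(lift_at_k(scores, labels, k=2))
```

A lift of 1 means the ranking does no better than random selection. Under this definition, an 8-fold lift would mean that audits guided by the scores find known positives at eight times the rate of audits chosen at random.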

Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under Award Number P30AG012810. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We thank the NBER for data access and support and Lowell Taylor for his contributions to earlier stages of the project. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.

Jetson Leder-Luis
Jetson Leder-Luis receives compensation for data analysis related to health care fraud litigation.

Shubhranshu Shekhar, Jetson Leder-Luis, and Leman Akoglu, "Unsupervised Machine Learning for Explainable Health Care Fraud Detection," NBER Working Paper 30946 (2023), https://doi.org/10.3386/w30946.
