A Mixed Integer Programming Formulation for Risk Stratification
Rachda Naila Mekhaldi, Julia Fleck, Raksmey PHAN, Xiaolan Xie
AI summary
Problem
Existing risk stratification methods rely on opaque machine learning models or rigid threshold rules that lack interpretability and clinical transparency. This gap hinders the design of targeted interventions and optimal resource allocation for at-risk patients.
Approach
The authors frame risk stratification as a Mixed Integer Programming problem to select optimal combinations of representative patient profiles. This approach assigns continuous risk scores (0–1) and groups patients into low, medium, and high-risk categories based on quartile thresholds.
Key results
- First MIP-based formulation for interpretable patient risk stratification
- Continuous risk scores enabling dynamic clinical thresholding
- Optimal selection of representative profiles for distinct risk groups
- Validated on public and proprietary datasets for accidental fall risk
Why it matters
Enables healthcare providers to deploy transparent, data-driven stratification tools that improve resource allocation and guide targeted preventive care.
Abstract
Risk stratification is the process of segmenting patients into distinct groups of similar complexity and care needs in order to improve resource allocation. Patients are typically risk stratified using statistical or machine learning methods that generate an individual risk score for some measure of resource use. One of the main limitations of existing methods is reduced interpretability, which is often inherent to artificial intelligence techniques. In this work, we propose a novel risk stratification approach that optimizes the representation of different patient groups and generates interpretable risk profiles. We associate risk scores with patient profiles and determine the optimal combination of representative profiles for each patient group using a Mixed Integer Programming (MIP) formulation. We generate continuous patient risk scores ranging from 0 to 1 that allow for dynamic thresholding. Our method stratifies patients into several risk groups (e.g., low, medium, high risk), which is frequently more clinically meaningful than binary classification. We apply our approach to both public and proprietary real-world data in the context of accidental fall risk assessment and show that the generated risk profiles provide clinical insights that can be used for the design of targeted interventions.
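To make the idea of dynamic thresholding concrete, the following is a minimal sketch of how continuous risk scores in [0, 1] could be mapped to ordered risk groups. It is not the authors' implementation and does not reproduce the MIP formulation: the function name `stratify`, the default cut-offs 0.25 and 0.75, and the rule that a score equal to a cut-off falls into the higher group are all assumptions made for illustration.

```python
from bisect import bisect_right


def stratify(scores, thresholds=(0.25, 0.75), labels=("low", "medium", "high")):
    """Map continuous risk scores in [0, 1] to ordered risk groups.

    The cut-offs are hypothetical defaults; a clinician could pass different
    values to move the group boundaries without recomputing the scores.
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")

    groups = []
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score {score} is outside [0, 1]")
        # bisect_right counts the cut-offs that are <= score, so a score
        # equal to a cut-off is assigned to the higher group (an assumption).
        groups.append(labels[bisect_right(thresholds, score)])
    return groups


if __name__ == "__main__":
    example_scores = [0.10, 0.50, 0.90]  # made-up scores, not study data
    print(stratify(example_scores))
    # Same scores, stricter cut-offs chosen by the user.
    print(stratify(example_scores, thresholds=(0.60, 0.95)))
```

Because the scores are continuous, changing the cut-offs only changes the grouping step. That is the practical benefit of thresholding a score rather than training a fixed binary classifier.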