Authors:
Eran N. Choman
Alon Lanyado
Prof. Michael K. Gould, MD, MS
Title
Flagging high-risk individuals with a ML model improves NSCLC early detection in a USPSTF-eligible population
Background & Aims
The USPSTF recommends annual lung cancer screening with low-dose computed tomography (LDCT) in adults aged 50 to 80 years who have a ≥20 pack-year smoking history and currently smoke or have quit within the past 15 years. Risk prediction models are an alternative way to identify high-risk individuals for screening and may have advantages over selection based on age and smoking history. We compared the performance of two risk prediction models: LungFlag and a version of PLCOm2012 adapted to EHR data (mPLCOm2012).
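The eligibility rule described above can be written as a simple check. This is a minimal sketch for illustration only: the function name and parameters are hypothetical and are not taken from the publication, and it assumes "quit within the past 15 years" means 15 years or fewer since quitting.

```python
from typing import Optional


def is_uspstf_eligible(
    age: int,
    pack_years: float,
    currently_smokes: bool,
    years_since_quit: Optional[float] = None,
) -> bool:
    """Return True if a person meets the screening criteria quoted above.

    Hypothetical helper: age 50 to 80, at least 20 pack-years, and either
    a current smoker or someone who quit within the past 15 years.
    """
    if not 50 <= age <= 80:
        return False
    if pack_years < 20:
        return False
    if currently_smokes:
        return True
    # Former smoker: eligible only if the quit time is known and recent.
    return years_since_quit is not None and years_since_quit <= 15
```

For example, a 60-year-old former smoker with 20 pack-years who quit 10 years ago would pass this check, while the same person who quit 16 years ago would not.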
Methods
Data from a large US health system, including 6,505 case patients with non-small cell lung cancer (NSCLC) and 189,597 contemporaneous NSCLC-free controls, were used to evaluate an optimized version of a previously published machine-learning model (LungFlag) for detecting NSCLC among individuals who meet the USPSTF criteria, and to compare its performance with that of mPLCOm2012. The model used existing routine outpatient lab measurements, smoking history, comorbidities, and demographic data.
Results
Performance was assessed using the area under the receiver operating characteristic curve (AUC) and diagnostic sensitivity in the USPSTF screen-eligible population (Tables 1-3 in PDF) and in the Ever Smokers population aged 50-80 (Table 4 in PDF). The risk predictor was calculated for a window of 3 to 12 months before the diagnosis date (Dx), using cut-offs that yield specificity levels of 97%, 95%, or 90%.
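To make the two metrics concrete, the sketch below shows one common way to compute AUC and sensitivity at a fixed specificity from risk scores. It is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, a higher score is assumed to mean higher risk, the cut-off is taken from the control scores alone, and the numbers in the usage note are toy values rather than study data.

```python
import math
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def threshold_for_specificity(
    control_scores: Sequence[float], target_specificity: float
) -> float:
    """Smallest cut-off t such that at least the target share of controls score <= t."""
    ordered = sorted(control_scores)
    n = len(ordered)
    # Number of controls that must remain unflagged to reach the target.
    k = min(max(math.ceil(target_specificity * n), 1), n)
    return ordered[k - 1]


def sensitivity_at_specificity(
    case_scores: Sequence[float],
    control_scores: Sequence[float],
    target_specificity: float,
) -> float:
    """Share of cases flagged (score strictly above the cut-off)."""
    t = threshold_for_specificity(control_scores, target_specificity)
    flagged = sum(1 for s in case_scores if s > t)
    return flagged / len(case_scores)
```

With toy control scores 1 through 10 and toy case scores 10, 5, 12, and 9, a 90% specificity target sets the cut-off at 9, so two of the four cases are flagged and the sensitivity is 0.5. Because controls that tie with the cut-off stay unflagged, the achieved specificity is never below the target.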
Conclusions
Using information already available in the EHR, the LungFlag model was more accurate than mPLCOm2012 for early diagnosis of NSCLC, demonstrating its potential to help prevent lung cancer deaths through early detection in both the USPSTF-eligible subgroup and the Ever Smokers population.