Research Watch

Construction and Validation of Active Case-Finding Tool in Community Participants with Chronic Obstructive Pulmonary Disease Using an Interpretable Machine Learning Approach.

Researchers

Heshen Tian, Fan Wu, Chuanqi Sun, Zhishan Deng, Yumin Zhou, Pixin Ran

Abstract

Early diagnosis is an effective strategy in chronic obstructive pulmonary disease (COPD) prevention. Active case-finding is an effective approach, but traditional tools such as COPD-SQ are limited by outdated data, poor extrapolation, and singular binary prediction. This study aimed to develop an updated, convenient, and interpretable machine learning tool for COPD screening in community participants. Data for model training and external validation were obtained from two community-based studies in Guangdong, China. PyCaret and the R programming language were used to develop machine learning models. Thirty original items, including demographic data, clinical features, and risk factor data, were initially used. Eleven machine learning classification models were compared, and the least absolute shrinkage and selection operator was further used to shrink predictors. Model performance was evaluated using ROC, AUC, accuracy, sensitivity, specificity, and other metrics. Shapley Additive exPlanations were used to interpret the models. A total of 5381 and 2456 participants from the training and external validation cohorts were included, respectively. In predicting COPD, the AdaBoost model showed the best performance, with an accuracy of 0.846 and an AUC of 0.848. For GOLD classification prediction, the model achieved an overall accuracy of 0.822 and an AUC of 0.816, and identified 83% of moderate-to-severe COPD cases in the community. In regression analysis, the gradient boosting regression model showed good consistency between predicted and measured FEV₁ %pred and FEV₁/FVC values. The models also demonstrated good performance in the external validation cohort and were deployed online. We constructed an active case-finding tool with integrated machine learning models for predicting COPD, COPD severity, and lung function parameters using limited clinical data. This tool may help prioritize high-risk individuals for confirmatory spirometry in community settings. Future implementation studies should evaluate its effect on referral efficiency, diagnostic yield, treatment uptake, and long-term outcomes.
Source: PubMed (PMID: 42333377)
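
The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, and AUC) can be illustrated with a minimal sketch. The Python below is not the authors' code and uses no data from the study: the labels, the scores, the 0.5 threshold, and the function names screening_metrics and auc are all hypothetical, chosen only to show how each metric is computed for a binary COPD / no-COPD prediction.

```python
from typing import Dict, Sequence


def screening_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at one decision threshold.

    y_true holds 1 for a confirmed case and 0 otherwise; y_score holds the
    model's predicted probability. The threshold is an illustrative
    assumption, not a value reported in the study.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of true cases flagged
        "specificity": tn / (tn + fp),  # share of non-cases cleared
    }


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative; ties count as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: eight participants, four with confirmed COPD.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(screening_metrics(toy_labels, toy_scores))
print(auc(toy_labels, toy_scores))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. In a case-finding setting, lowering the threshold generally raises sensitivity at the cost of specificity.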