Development and Validation of a Natural Language Processing Algorithm Using Electronic Health Record Data to Identify Breast Cancer Patients With Low Social Support.
Researchers
Candyce H Kroenke, Rhonda Aoki, Lauren Mammini, Stacey Alexeeff, Salene M W Jones, Lawrence H Kushi, Jessica Mogk, Shaila Strayhorn-Carter, David Mosen, David Cronkite
Abstract
Social support is important in the management of breast cancer. We developed a series of structured variables, or modules, from electronic health record (EHR) data to form the basis for EHRsupport, an EHR-based measure of social support. We built a natural language processing (NLP) algorithm from clinical notes in 7,989 women diagnosed with invasive breast cancer from January 2006 to September 2021. Our team reviewed charts, developed a lexicon, manually crafted rules, applied the NLP algorithm to 565,258 EHR notes and 68,760 patient messages dated from 1 month before to 3 months after diagnosis, combined terms into separate modules, and tuned (N = 40), updated, and tested (N = 100) the modules against chart review. We further developed modules from structured data sources. We identified and developed 11 modules from unstructured data: spouse/partner status, parenthood status, (clinical) visit support, living situation (alone or with others), friend or other support, positive social support, negative (lack of) social support, deceased person, transportation issues, relationship conflict or stress, and social isolation. Module data availability ranged from 1.4% for social isolation to 92.0% for spouse/partner. Module accuracy ranged from 0.81 to 0.95. The most highly available modules had moderate to excellent F1 scores (0.75-1.00), precision or positive predictive value (0.61-1.00), and recall or sensitivity (0.75-1.00). Recall and precision were more variable among rarer modules. Five percent of patients lacked an in-state emergency contact, and 3% of patients had diagnostic codes for low support at any time point. The EHRsupport algorithm accurately identified social support data, supporting the development of a clinical tool that can be used to identify patients with low social support.
Source: PubMed (PMID: 42063534)
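The validation metrics reported in the abstract (accuracy, precision or positive predictive value, recall or sensitivity, and F1) can be illustrated with a minimal sketch. It assumes each module yields a binary flag per patient and that chart review supplies the reference label. The function name `classification_metrics` and the example labels are hypothetical and are not taken from the study.

```python
def classification_metrics(predicted, reference):
    """Compare binary module output against chart-review labels.

    Both arguments are equal-length sequences of 0/1 values, one per patient.
    Returns accuracy, precision (PPV), recall (sensitivity), and F1.
    A metric is None when its denominator is zero.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one patient is required")

    tp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 1)
    fp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 0)
    fn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 1)
    tn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 0)

    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(predicted)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for eight patients (illustration only, not study data).
nlp_flags = [1, 1, 1, 0, 0, 0, 1, 0]      # hypothetical module output
chart_review = [1, 1, 0, 0, 0, 1, 1, 0]   # hypothetical reference standard
metrics = classification_metrics(nlp_flags, chart_review)
print(metrics)
```

For rare modules the counts of true positives are small, so a single misclassified chart shifts precision and recall substantially, which is consistent with the greater variability the abstract reports for rarer modules.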