Usefulness of ChatGPT in physiotherapy practice: A scoping review.
Researchers
Brian Crinion, Cuisle Forde, Teresa Ryan, David Mockler, Julie Broderick
Abstract
This scoping review aimed to map the current evidence examining the use of ChatGPT within musculoskeletal (MSK) physiotherapy, with a focus on how the technology has been studied rather than on claims of effectiveness or readiness for clinical practice. The review was conducted in accordance with PRISMA-ScR guidance. Electronic searches were performed in MEDLINE (via PubMed), Embase, CINAHL, PEDro and the Cochrane Library, with the most recent update conducted in November 2025. Empirical studies evaluating ChatGPT in physiotherapy-relevant or MSK rehabilitation contexts were included. Included studies were mapped descriptively according to clinical task, study design, comparator, outcome domain, and interaction conditions, including prompt structure. No critical appraisal of effectiveness or safety was undertaken. Fourteen studies published between 2023 and 2025 met the inclusion criteria. Study designs, clinical tasks, comparators and outcome measures were highly heterogeneous. ChatGPT was evaluated across diagnostic, treatment, educational and informational tasks, using a variety of MSK scenarios or guideline-derived questions. Outcome domains included agreement with expert judgement, alignment with clinical practice guidelines, and ratings of clarity or appropriateness. No shared or standardised evaluation framework was used across studies. Apparent differences in reported agreement between diagnostic and treatment-related tasks were largely attributable to differences in task framing, reference standards, and evaluation methods rather than intrinsic model capability. The current evidence base examining ChatGPT in MSK physiotherapy is exploratory and methodologically immature. Reported accuracy or alignment metrics cannot be interpreted independently of study design and interaction conditions.
Future research should prioritise physiotherapy-focused study designs that clearly define diagnostic and treatment tasks, specify prompt and interaction conditions, and examine how artificial-intelligence-generated outputs are interpreted and used by clinicians, students and patients.
Source: PubMed (PMID: 42365742)