• Operationalization of Artificial Intelligence Applications in the Intensive Care Unit: A Systematic Review

    Willemijn E M Berkhout, Julia J van Wijngaarden, Jessica D Workum, Davy van de Sande, Denise E Hilling, Christian Jung, Geert Meyfroidt, Diederik Gommers, Stefan N R Buijsman, Michel E van Genderen
    JAMA Netw Open. 2025 Jul 1;8(7):e2522866. doi: 10.1001/jamanetworkopen.2025.22866.

    Abstract

    Importance: Artificial intelligence (AI) presents transformative opportunities to address the increasing challenges faced by health care systems globally. Particularly in data-rich environments, such as intensive care units (ICUs), AI could enhance clinical decision-making, streamline workflows, and improve patient outcomes. Despite these promising applications, the practical implementation of AI in clinical settings remains limited.

    Objective: To systematically evaluate AI system operationalization in the ICU, focusing on the AI field's progress over time, technical maturity, and risk of bias.

    Evidence review: In this systematic review, 5 databases (Embase, MEDLINE ALL, Web of Science Core Collection, Cochrane Central Register of Controlled Trials, and Google Scholar) were searched for studies published from July 28, 2020, to June 10, 2024. Eligible studies evaluated AI applications designed for use within ICUs for adults (aged ≥16 years) and used data collected during ICU stays. Two reviewers independently screened titles and abstracts, with a third reviewer resolving disagreements. Data extraction included AI application aims, dataset origins, technology readiness level (TRL) categorization, and the use of reporting standards. Risk of bias was assessed using the PROBAST (Prediction Model Study Risk of Bias Assessment Tool).

    Findings: Of 17 401 screened records, 1263 studies met the inclusion criteria. A total of 936 studies (74% of all studies) were classified as TRL 4 or below, indicating early-stage development or initial validation. Among these, 447 (37%) used internal datasets, 562 (46%) used MIMIC (Medical Information Mart for Intensive Care) datasets (I-IV), and 78 (6%) used the open-source eICU Collaborative Research Database. External validation (TRL 5) was achieved by 24% of studies. Only 25 (2%) progressed to clinical integration (TRL≥6), with no studies reaching full implementation (TRL 9). Although approximately half of generative AI models reached a higher TRL (14 [47%] with TRL 5), none reached clinical integration. Additionally, only 207 studies (16%) referenced reporting standards, with adherence modestly increasing from 14% in 2021 to 23% in 2024. High risk of bias was identified in 581 of 1103 studies (53%), primarily due to methodologic shortcomings in the analysis domain.
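
    The headline proportions in the Findings can be re-derived from the counts stated there. The following is a minimal sketch, not part of the original review: the helper name `pct` and the dictionary layout are illustrative assumptions, and only figures whose numerator and denominator both appear in the abstract are included.

```python
# Minimal sketch: recompute percentages from counts stated in the Findings.
# `pct` and `reported` are illustrative names, not taken from the review.

def pct(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)

INCLUDED = 1263        # studies that met the inclusion criteria
BIAS_DENOMINATOR = 1103  # denominator stated for the risk-of-bias result

# label -> (count, denominator, percentage stated in the abstract)
reported = {
    "TRL 4 or below": (936, INCLUDED, 74),
    "Clinical integration (TRL >= 6)": (25, INCLUDED, 2),
    "Referenced reporting standards": (207, INCLUDED, 16),
    "High risk of bias": (581, BIAS_DENOMINATOR, 53),
}

for label, (count, denominator, stated) in reported.items():
    computed = pct(count, denominator)
    print(f"{label}: {count}/{denominator} = {computed}% (stated {stated}%)")
```

    Each recomputed value agrees with the rounded percentage given in the text. The dataset-origin percentages are left out because the abstract does not state their denominator.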

    Conclusions and relevance: Despite substantial growth in AI research within intensive care medicine in recent years, the transition from development to clinical implementation remains limited and has made little progress over time. A paradigm shift is urgently required in the medical literature, one that moves beyond retrospective validation toward the operationalization and prospective testing of AI for tangible clinical impact.