• Artificial Intelligence-Augmented Human Instruction and Surgical Simulation Performance: A Randomized Clinical Trial

    Bianca Giglio, Abdulmajeed Albeloushi, Ahmad Kh Alhaj, Mohamed Alhantoobi, Rothaina Saeedi, Vanja Davidovic, Abicumaran Uthamacumaran, Recai Yilmaz, Jason Lapointe, Neevya Balasubramaniam, Trisha Tee, Ali M Fazlollahi, José A Correa, Rolando F Del Maestro
    JAMA Surg. 2025 Aug 6:e252564. doi: 10.1001/jamasurg.2025.2564. Online ahead of print.

    Abstract

    Importance: How the Intelligent Continuous Expertise Monitoring System, an artificial intelligence tutoring system, might be best optimized for surgical training is unknown.

    Objective: To determine the effects of artificial intelligence-augmented personalized expert instruction vs intelligent tutoring alone on surgical performance, skill transfer, and affective-cognitive responses.

    Design, setting, and participants: This single-blinded randomized clinical trial was conducted among a volunteer sample of medical students in preparatory, first, or second year without prior use of a virtual reality surgical simulator (NeuroVR) at the McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre in Montreal, Quebec, Canada. Cross-sectional data were collected from March to September 2024, and per-protocol data analysis was conducted in March 2025.

    Intervention: During simulated surgical procedures, trainees received 1 of 3 feedback methods. Group 1 received only intelligent tutor instruction (control). The 2 intervention arms included group 2, which received expert feedback in identical words to the intelligent tutor, and group 3, which received artificial intelligence data-informed personalized expert feedback.

    Main outcomes and measures: The coprimary outcomes included change in overall surgical performance across practice resections and skill transfer to a complex realistic scenario, measured by artificial intelligence-calculated composite expertise score (range, -1.00 [novice] to 1.00 [expert]). Secondary outcomes included emotional and cognitive demands, measured via questionnaires.

    Results: In this randomized clinical trial, the final analysis included 87 medical students (46 [53%] women; mean [SD] age, 22.7 [4.0] years), with 30, 29, and 28 participants in groups 1, 2, and 3, respectively. Group 3 achieved significantly higher scores than group 1 across several trials, including trial 5 (mean difference, 0.26; 95% CI, 0.09-0.43; P = .01) and the realistic task (mean difference, 0.20; 95% CI, 0.06-0.34; P = .02). Group 3 also achieved significantly better scores than the other 2 groups in certain metrics, such as bleeding and injury risk. Emotions and cognitive load demonstrated significant differences.

    Conclusions and relevance: In this randomized clinical trial, personalized expert instruction resulted in enhanced surgical performance and skill transfer compared with intelligent tutor instruction, highlighting the importance of human input and participation in artificial intelligence-based surgical training.