Deep Learning: Deep Learning and Musculoskeletal Apps Imaging Pearls - Educational Tools | CT Scanning | CT Imaging | CT Scan Protocols - CTisus

  • Background: As the number of conventional radiographic examinations in pediatric emergency departments increases, so, too, does the number of reading errors by radiologists.
    Objective: The aim of this study is to investigate the ability of artificial intelligence (AI) to improve the detection of fractures by radiologists in children and young adults.
    Materials and methods: A cohort of 300 anonymized radiographs performed for the detection of appendicular fractures in patients ages 2 to 21 years was collected retrospectively. The ground truth for each examination was established after an independent review by two radiologists with expertise in musculoskeletal imaging. Discrepancies were resolved by consensus with a third radiologist. Half of the 300 examinations showed at least 1 fracture. Radiographs were read by three senior pediatric radiologists and five radiology residents in the usual manner and then read again immediately after with the help of AI.
    Assessment of an artificial intelligence aid for the detection of appendicular skeletal fractures in children and young adults by senior and junior radiologists
    Toan Nguyen et al.
    Pediatric Radiology (2022) 52:2215–2226
  • Results: The mean sensitivity for all groups was 73.3% (110/150) without AI; it increased significantly by almost 10% (P<0.001) to 82.8% (125/150) with AI. For junior radiologists, it increased by 10.3% (P<0.001) and for senior radiologists by 8.2% (P=0.08). On average, there was no significant change in specificity (from 89.6% to 90.3% [+0.7%, P=0.28]); for junior radiologists, specificity increased from 86.2% to 87.6% (+1.4%, P=0.42) and for senior radiologists, it decreased from 95.1% to 94.9% (-0.2%, P=0.23). The stand-alone sensitivity and specificity of the AI were, respectively, 91% and 90%.
    Conclusion: With the help of AI, sensitivity increased by an average of 10% without significantly decreasing specificity in fracture detection in a predominantly pediatric population.
    Assessment of an artificial intelligence aid for the detection of appendicular skeletal fractures in children and young adults by senior and junior radiologists
    Toan Nguyen et al.
    Pediatric Radiology (2022) 52:2215–2226
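As a rough illustration of how figures like those above are derived, the sketch below computes sensitivity and specificity from one reader's counts and expresses the gain in percentage points. The counts and the names `ReaderCounts`, `sensitivity`, and `specificity` are hypothetical and are not taken from the study, whose reported values are means across eight readers.

```python
from dataclasses import dataclass


@dataclass
class ReaderCounts:
    """Confusion-matrix counts for one reader (hypothetical values)."""
    tp: int  # fractures correctly called
    fn: int  # fractures missed
    tn: int  # normal examinations correctly called
    fp: int  # normal examinations called as fracture


def sensitivity(c: ReaderCounts) -> float:
    """Fraction of fracture examinations that were detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ReaderCounts) -> float:
    """Fraction of non-fracture examinations that were called normal."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts on 150 fracture and 150 non-fracture examinations.
without_ai = ReaderCounts(tp=108, fn=42, tn=135, fp=15)
with_ai = ReaderCounts(tp=123, fn=27, tn=135, fp=15)

gain_points = 100 * (sensitivity(with_ai) - sensitivity(without_ai))
print(f"sensitivity: {sensitivity(without_ai):.1%} -> {sensitivity(with_ai):.1%}")
print(f"specificity: {specificity(without_ai):.1%} -> {specificity(with_ai):.1%}")
print(f"gain: {gain_points:.1f} percentage points")

# Pooled figures quoted above: 73.3% without AI and 82.8% with AI.
# The difference is an absolute change in percentage points.
reported_gain = round(82.8 - 73.3, 1)
print(f"reported gain: {reported_gain} percentage points")
```

The "almost 10%" in the abstract is therefore an absolute difference in percentage points, not a relative improvement.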
  • “We have shown that the diagnostic performance of junior and senior radiologists for fracture detection from conventional radiographs can be improved with the assistance of AI. The study confirms that AI is suitable for bone fracture detection in clinical practice even for young children. A prospective evaluation in a setting closer to the real-life scenario should be considered.”
    Assessment of an artificial intelligence aid for the detection of appendicular skeletal fractures in children and young adults by senior and junior radiologists
    Toan Nguyen et al.
    Pediatric Radiology (2022) 52:2215–2226
  • “Second, our study was retrospective in nature, with readers in artificial reading conditions, which could affect their reading. Moreover, the performance of readers was assessed solely on their ability to make decisions from the radiograph alone, without any of the clinical information or medical history that can be crucial in decision-making, creating a context bias. This same limitation applies to the radiologists who determined the ground truth, as they also worked without clinical information. Clinical information could have increased the sensitivity and specificity of readers and would have been more akin to daily practice. Furthermore, in everyday practice, indications are diverse and do not concern only trauma. Finally, reading with AI immediately after reading without AI could have introduced some bias. A study with clinical information, two separate phases and a washout period in between should be considered to remove these biases.”
    Assessment of an artificial intelligence aid for the detection of appendicular skeletal fractures in children and young adults by senior and junior radiologists
    Toan Nguyen et al.
    Pediatric Radiology (2022) 52:2215–2226
  • Purpose: To conduct a prospective observational study across 12 U.S. hospitals to evaluate real-time performance of an interpretable artificial intelligence (AI) model to detect COVID-19 on chest radiographs.
    Materials and Methods: A total of 95 363 chest radiographs were included in model training, external validation, and real-time validation. The model was deployed as a clinical decision support system, and performance was prospectively evaluated. There were 5335 total real-time predictions and a COVID-19 prevalence of 4.8% (258 of 5335). Model performance was assessed with use of receiver operating characteristic analysis, precision-recall curves, and F1 score. Logistic regression was used to evaluate the association of race and sex with AI model diagnostic accuracy. To compare model accuracy with the performance of board-certified radiologists, a third dataset of 1638 images was read independently by two radiologists.
    Conclusion: AI-based tools have not yet reached full diagnostic potential for COVID-19 and underperform compared with radiologist prediction.
    Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study
    Ju Sun, et al.
    Radiology: Artificial Intelligence 2022; 4(4):e210217
  • Summary
    This 12-site prospective study characterizes the real-time performance of an artificial intelligence–based diagnostic tool for COVID-19, which may serve as an adjunct to, but not as a replacement for, clinical decision-making in the diagnosis of COVID-19.  
    Key Points
    • The COVID-19 artificial intelligence (AI) diagnostic tool achieved an area under the receiver operating characteristic curve of 0.70 on real-time validation.
    • At equity and subgroup analysis, the AI tool demonstrated improved diagnostic capabilities in participants with more severe disease and in non-White participants, improved sensitivity in men, and improved specificity in women during real-time and external validations.
    • The COVID-19 AI diagnostic system had significantly lower accuracy (63.5%) compared with radiologists (radiologist 1 = 67.8% correct, radiologist 2 = 68.6% correct; McNemar P < .001).
    Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study
    Ju Sun, et al.
    Radiology: Artificial Intelligence 2022; 4(4):e210217
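The key points above cite a McNemar test for the paired comparison of the AI tool and each radiologist on the same images. A minimal sketch of the exact (binomial) form of that test follows. The function name `mcnemar_exact` and the discordant counts in the usage lines are hypothetical, since the paper's discordant counts are not quoted here, and the study may have used a different variant of the test.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases reader A got right and reader B got wrong
    c: cases reader B got right and reader A got wrong
    Cases where both readers agree do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of k or fewer successes under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for an AI tool versus one radiologist.
ai_only_correct = 180
radiologist_only_correct = 250
p_value = mcnemar_exact(ai_only_correct, radiologist_only_correct)
print(f"McNemar exact p-value: {p_value:.4g}")
```

Only the images on which the two readers disagree carry information about which reader is more accurate, which is why the test ignores concordant cases.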
  • “In conclusion, AI-based diagnostic tools may serve as an adjunct to, but not a replacement for, clinical decision-making concerning COVID-19 diagnosis, which largely hinges on exposure history, signs, and symptoms. Although AI-based tools have not yet reached full diagnostic potential in COVID-19, they may still offer valuable information to clinicians when taken into consideration along with clinical signs and symptoms.”
    Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study
    Ju Sun, et al.
    Radiology: Artificial Intelligence 2022; 4(4):e210217
  • Background: Cinematic Rendering (CR) is a recently introduced post-processing three-dimensional (3D) visualization imaging tool. The aim of this study was to assess its clinical value in the preoperative planning of deep inferior epigastric artery perforator (DIEP) or muscle-sparing transverse rectus abdominis myocutaneous (MS-TRAM) flaps, and to compare it with maximum intensity projection (MIP) images. The study presents the first application of CR for perforator mapping prior to autologous breast reconstruction.
    Conclusion: The current study serves as an explorative study, showing first experiences with CR in abdominal-based autologous breast reconstruction. In addition to MIP images, CR might improve the surgeon’s understanding of the individual’s anatomy. Future studies are required to compare CR with other 3D visualization tools and its possible effects on operative parameters.
    The third dimension in perforator mapping—Comparison of Cinematic Rendering and maximum intensity projection in abdominal-based autologous breast reconstruction
    Journal of Plastic, Reconstructive & Aesthetic Surgery 75 (2022) 536–543  
  • “CR is a promising 3D visualization technique and can assist surgeons in understanding the patient’s anatomy with all details. There is a continuous development of the underlying algorithms, and the use of artificial intelligence might help to refine the tool in newer software versions. Future studies may consider comparison of CR with other 3D visualization tools. Moreover, the technique might aid the understanding of anatomy in different fields of reconstructive surgery, for example, in the treatment of complex tissue defects or hand surgery. Future studies are needed to investigate the use of CR in other free flap options and its possible positive effects concerning operative parameters, for example, flap harvest time or the occurrence of intraoperative complications.”
    The third dimension in perforator mapping—Comparison of Cinematic Rendering and maximum intensity projection in abdominal-based autologous breast reconstruction
    Journal of Plastic, Reconstructive & Aesthetic Surgery 75 (2022) 536–543 
  • Background: Patients with fractures are a common emergency presentation and may be misdiagnosed at radiologic imaging. An increasing number of studies apply artificial intelligence (AI) techniques to fracture detection as an adjunct to clinician diagnosis.
    Purpose: To perform a systematic review and meta-analysis comparing the diagnostic performance in fracture detection between AI and clinicians in peer-reviewed publications and the gray literature (ie, articles published on preprint repositories).
    Conclusion: Artificial intelligence (AI) and clinicians had comparable reported diagnostic performance in fracture detection, suggesting that AI technology holds promise as a diagnostic adjunct in future clinical practice.
    Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
    Rachel Y. L. Kuo et al.
    Radiology 2022; 304:50–62
  • Summary
    Artificial intelligence is noninferior to clinicians in terms of diagnostic performance in fracture detection, showing promise as a useful diagnostic tool.
    Key Results
    • In a systematic review and meta-analysis of 42 studies (37 studies with radiography and five studies with CT), the pooled diagnostic performance from the use of artificial intelligence (AI) to detect fractures had a sensitivity of 92% and 91% and specificity of 91% and 91%, on internal and external validation, respectively.
    • Clinician performance had comparable performance to AI in fracture detection (sensitivity 91%, 92%; specificity 94%, 94%).
    • Only 13 studies externally validated results, and only one study evaluated AI performance in a prospective clinical trial.
    Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
    Rachel Y. L. Kuo et al.
    Radiology 2022; 304:50–62
  • “Future research should seek to externally validate algorithms in prospective clinical settings and provide a fair comparison with relevant clinicians: for example, providing clinicians with routine clinical detail. External validation and evaluation of algorithms in prospective randomized clinical trials is a necessary next step toward clinical deployment. Current artificial intelligence (AI) is designed as a diagnostic adjunct and may improve workflow through screening or prioritizing images on worklists and highlighting regions of interest for a reporting radiologist. AI may also improve diagnostic certainty through acting as a “second reader” for clinicians or as an interim report prior to radiologist interpretation. However, it is not a replacement for the clinical workflow, and clinicians must understand AI performance and exercise judgement in interpreting algorithm output. We advocate for transparent reporting of study methods and results as crucial to AI integration. By addressing these areas for development, deep learning has potential to streamline fracture diagnosis in a way that is safe and sustainable for patients and health care systems.”
    Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
    Rachel Y. L. Kuo et al.
    Radiology 2022; 304:50–62
  • “The status of AI in medical imaging in the next 10 years will depend on regulatory policy, reimbursement models, success in the incorporation of AI into routine workflow, development and adoption of standards versus platforms for AI applications, and the level of success in generalizing deep learning algorithms to different machines, geographies, and diverse patient populations to minimize bias.”
    Future Directions in Artificial Intelligence
    Babak Saboury, Michael Morris, MD, Eliot Siegel
    Radiol Clin N Am 59 (2021) 1085–1095
  • Background: Artificial Intelligence (AI)/Machine Learning (ML) applications have been proven efficient to improve diagnosis, to stratify risk, and to predict outcomes in many respective medical specialties, including in orthopaedics.
    Challenges and Discussion: Regarding hip and knee reconstruction surgery, AI/ML have not made it yet to clinical practice. In this review, we present sound AI/ML applications in the field of hip and knee degenerative disease and reconstruction. From osteoarthritis (OA) diagnosis and prediction of its advancement, clinical decision-making, identification of hip and knee implants to prediction of clinical outcome and complications following a reconstruction procedure of these joints, we report how AI/ML systems could facilitate data-driven personalized care for our patients.
    Applications of artificial intelligence and machine learning for the hip and knee surgeon: current state and implications for the future
    Christophe Nich et al.
    International Orthopaedics (2022) 46:937–944
  • “In a near future, AI/ML will probably provide the orthopaedic surgeon with key tools in an increasingly data-driven and data-dependent world. As the amount of patient-related data continues to grow, it is becoming evident that medical decisions will increasingly have recourse to AI/ML. The latter will need to be incorporated into the daily practice, with the help of automated algorithms for computers. Also, it is probable that advanced ML systems will overcome the problem of missing data. Advances in unsupervised learning will enable far greater characterization of patient’s risk factors for complications or failure following hip or knee reconstruction. Ultimately, this will lead to better surgical technique selection, improved outcomes, and lower healthcare costs.”
    Applications of artificial intelligence and machine learning for the hip and knee surgeon: current state and implications for the future
    Christophe Nich et al.
    International Orthopaedics (2022) 46:937–944

  • “Hip fractures are a major cause of morbidity and mortality in the elderly, and incur high health and social care costs. Given projected population ageing, the number of incident hip fractures is predicted to increase globally. As fracture classification strongly determines the chosen surgical treatment, differences in fracture classification influence patient outcomes and treatment costs. We aimed to create a machine learning method for identifying and classifying hip fractures, and to compare its performance to experienced human observers. We used 3659 hip radiographs, classified by at least two expert clinicians. The machine learning method was able to classify hip fractures with 19% greater accuracy than humans, achieving overall accuracy of 92%.”
    Machine learning outperforms clinical experts in classification of hip fractures
    E. A. Murphy et al.
    Scientific Reports (Nature) (2022) 12:2058 
  • “In this work, we have demonstrated that a trained neural network can classify hip fractures with 19% increased accuracy compared to human observers with experience of hip fracture classification in a clinical setting. In the work presented here, we used as ground truth the classification of 3,659 hip radiographs by at least two (and up to five) experts to achieve consensus. Thus, this analysis is a prototype only and a more extensive study is needed before this approach can be fully transformed to a clinical application. We envisage that this approach could be used clinically and aid in the diagnosis and in the treatment of patients who sustain hip fractures.”
    Machine learning outperforms clinical experts in classification of hip fractures
    E. A. Murphy et al.
    Scientific Reports (Nature) (2022) 12:2058 
  • Background: Patients with fractures are a common emergency presentation and may be misdiagnosed at radiologic imaging. An increasing number of studies apply artificial intelligence (AI) techniques to fracture detection as an adjunct to clinician diagnosis.
    Purpose: To perform a systematic review and meta-analysis comparing the diagnostic performance in fracture detection between AI and clinicians in peer-reviewed publications and the gray literature (ie, articles published on preprint repositories).  
    Materials and Methods: A search of multiple electronic databases between January 2018 and July 2020 (updated June 2021) was performed that included any primary research studies that developed and/or validated AI for the purposes of fracture detection at any imaging modality and excluded studies that evaluated image segmentation algorithms. Meta-analysis with a hierarchical model to calculate pooled sensitivity and specificity was used. Risk of bias was assessed by using a modified Prediction Model Study Risk of Bias Assessment Tool, or PROBAST, checklist.
    Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
    Kuo RYL et al.
    Radiology 2022; 000:1–13 • https://doi.org/10.1148/radiol.211785 
  • Results: Included for analysis were 42 studies, with 115 contingency tables extracted from 32 studies (55 061 images). Thirty-seven studies identified fractures on radiographs and five studies identified fractures on CT images. For internal validation test sets, the pooled sensitivity was 92% (95% CI: 88, 93) for AI and 91% (95% CI: 85, 95) for clinicians, and the pooled specificity was 91% (95% CI: 88, 93) for AI and 92% (95% CI: 89, 92) for clinicians. For external validation test sets, the pooled sensitivity was 91% (95% CI: 84, 95) for AI and 94% (95% CI: 90, 96) for clinicians, and the pooled specificity was 91% (95% CI: 81, 95) for AI and 94% (95% CI: 91, 95) for clinicians. There were no statistically significant differences between clinician and AI performance. There were 22 of 42 (52%) studies that were judged to have high risk of bias. Meta-regression identified multiple sources of heterogeneity in the data, including risk of bias and fracture type.  
    Conclusion: Artificial intelligence (AI) and clinicians had comparable reported diagnostic performance in fracture detection, suggesting that AI technology holds promise as a diagnostic adjunct in future clinical practice.  
    Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
    Kuo RYL et al.
    Radiology 2022; 000:1–13 • https://doi.org/10.1148/radiol.211785 
  • Background: Proximal femoral fractures are an important clinical and public health issue associated with substantial morbidity and early mortality. Artificial intelligence might offer improved diagnostic accuracy for these fractures, but typical approaches to testing of artificial intelligence models can underestimate the risks of artificial intelligence-based diagnostic systems.
    Methods: We present a preclinical evaluation of a deep learning model intended to detect proximal femoral fractures in frontal x-ray films in emergency department patients, trained on films from the Royal Adelaide Hospital (Adelaide, SA, Australia). This evaluation included a reader study comparing the performance of the model against five radiologists (three musculoskeletal specialists and two general radiologists) on a dataset of 200 fracture cases and 200 non-fractures (also from the Royal Adelaide Hospital), an external validation study using a dataset obtained from Stanford University Medical Center, CA, USA, and an algorithmic audit to detect any unusual or unexpected model behaviour.  
    Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study
    Lauren Oakden-Rayner et al.
    www.thelancet.com/digital-health Published online April 5, 2022 https://doi.org/10.1016/S2589-7500(22)00004-8  
  • Findings: In the reader study, the area under the receiver operating characteristic curve (AUC) for the performance of the deep learning model was 0·994 (95% CI 0·988–0·999) compared with an AUC of 0·969 (0·960–0·978) for the five radiologists. This strong model performance was maintained on external validation, with an AUC of 0·980 (0·931–1·000). However, the preclinical evaluation identified barriers to safe deployment, including a substantial shift in the model operating point on external validation and an increased error rate on cases with abnormal bones (eg, Paget’s disease).  
    Interpretation: The model outperformed the radiologists tested and maintained performance on external validation, but showed several unexpected limitations during further testing. Thorough preclinical evaluation of artificial intelligence models, including algorithmic auditing, can reveal unexpected and potentially harmful behaviour even in high-performance artificial intelligence systems, which can inform future clinical testing and deployment decisions.  
    Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study
    Lauren Oakden-Rayner et al.
    www.thelancet.com/digital-health Published online April 5, 2022 https://doi.org/10.1016/S2589-7500(22)00004-8  
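The findings above describe an AUC that held up on external validation alongside a substantial shift in the model operating point. The sketch below illustrates how both can occur at once: AUC depends only on how cases are ranked, while sensitivity and specificity depend on a fixed score threshold. All scores, the threshold of 0.5, and the function names `auc` and `sens_spec` are hypothetical and are not taken from the study.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def sens_spec(pos_scores, neg_scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    sens = sum(p >= threshold for p in pos_scores) / len(pos_scores)
    spec = sum(n < threshold for n in neg_scores) / len(neg_scores)
    return sens, spec


# Hypothetical model scores at the training site.
internal_pos = [0.90, 0.80, 0.70, 0.60]
internal_neg = [0.40, 0.30, 0.20, 0.10]

# Hypothetical scores at an external site: same ranking, lower calibration.
external_pos = [0.65, 0.55, 0.45, 0.35]
external_neg = [0.20, 0.15, 0.10, 0.05]

threshold = 0.5  # operating point chosen on internal data (assumed)

print("internal AUC:", auc(internal_pos, internal_neg))
print("external AUC:", auc(external_pos, external_neg))
print("internal sens/spec:", sens_spec(internal_pos, internal_neg, threshold))
print("external sens/spec:", sens_spec(external_pos, external_neg, threshold))
```

In this toy case the ranking is perfect at both sites, yet the fixed threshold misses half of the external positives, which is why an unchanged AUC alone does not show that a deployed threshold transfers safely.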
  • Added value of this study:  This study presents a thorough preclinical evaluation of a medical artificial intelligence system (trained to detect proximal femoral fractures on plain film imaging). Despite high performance of the model, which outperformed human experts in the task of proximal femoral fracture detection, an evaluation including algorithmic auditing showed unexpected and potentially harmful algorithmic behaviour.  
    Implications of all the available evidence:  Thorough evaluation of artificial intelligence systems, including algorithmic auditing, can identify barriers to safe artificial intelligence deployment that might not be appreciated during standard preclinical testing and which could cause significant harm. Regulators, medical governance bodies, and professional groups should consider the need for more comprehensive preclinical testing of artificial intelligence before clinical deployment.  
    Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study
    Lauren Oakden-Rayner et al.
    www.thelancet.com/digital-health Published online April 5, 2022 https://doi.org/10.1016/S2589-7500(22)00004-8  
  • “We note that although our model shows high performance, and does not appear to deviate from human performance in prespecified subgroups, it does still make the occasional inhuman error (eg, misdiagnosing a highly displaced fracture). We also note on saliency mapping that although the model reproduces some recognizable aspects of human practice (eg, it appears to pay attention to Shenton’s line), the visualizations nonetheless raise concerns about the regions that are not highlighted in the heatmaps. In particular, the saliency maps almost never show strong activity along the outer region of the femoral neck, even in cases where the cortex in this area is clearly disrupted.”
    Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study
    Lauren Oakden-Rayner et al.
    www.thelancet.com/digital-health Published online April 5, 2022 https://doi.org/10.1016/S2589-7500(22)00004-8  
  • “Our study evaluated a high-performance proximal femoral fracture detection deep learning model, which outperforms highly trained clinical specialists in diagnostic conditions, as well as other clinical readers in normal clinical conditions. The performance of the artificial intelligence system was maintained when applied to an external validation sample, and a thorough analysis of the behaviour of the artificial intelligence system shows that it is mostly consistent with that of human experts. We also characterized the occasional aberrant or unexpected behaviour of the artificial intelligence model which could inform future clinical testing protocols. We next intend to test our model in a clinical environment, in the form of an interventional randomised controlled trial.”
    Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study
    Lauren Oakden-Rayner et al.
    www.thelancet.com/digital-health Published online April 5, 2022 https://doi.org/10.1016/S2589-7500(22)00004-8  
  • "Our study had a number of limitations. First, the deep learning model itself is limited by being unable to act on cases with implanted metalwork (although our system is able to automatically identify these cases and exclude them from analysis). Second, the sample size of the MRMC study was limited by the availability of readers; we determined a total dataset of 400 cases (200 positive and 200 negative cases) was as many as we could reasonably expect the readers to review, and only five radiologists reviewed the cases under diagnostic conditions as defined in the local standards of practice.”  
    Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study
    Lauren Oakden-Rayner et al.
    www.thelancet.com/digital-health Published online April 5, 2022 https://doi.org/10.1016/S2589-7500(22)00004-8  
  • Objectives: To develop and validate machine learning models to distinguish between benign and malignant bone lesions and compare the performance to radiologists.
    Results: The best machine learning model was based on an artificial neural network (ANN) combining both radiomic and demographic information achieving 80% and 75% accuracy at 75% and 90% sensitivity with 0.79 and 0.90 AUC on the internal and external test set, respectively. In comparison, the radiology residents achieved 71% and 65% accuracy at 61% and 35% sensitivity while the radiologists specialized in musculoskeletal tumor imaging achieved an 84% and 83% accuracy at 90% and 81% sensitivity, respectively.  
    Conclusions: An ANN combining radiomic features and demographic information showed the best performance in distinguishing between benign and malignant bone lesions. The model showed lower accuracy compared to specialized radiologists, while accuracy was higher or similar compared to residents.  
    Development and evaluation of machine learning models based on X-ray radiomics for the classification and differentiation of malignant and benign bone tumors  
    Claudio E. von Schacky et al.
    European Radiology 2022 https://doi.org/10.1007/s00330-022-08764-w 
  • “In this study, machine learning models based on radiomics and demographic information were developed and validated to distinguish between benign and malignant bone lesions on radiographs and compared to radiologists on an external test set. Overall, machine learning models using the combination of radiomics and demographic information showed a higher diagnostic accuracy than machine learning models using radiomics or demographic information only. The best model was based on an ANN that used both radiomics and demographic information. On an external test set, this model demonstrated lower accuracy compared to radiologists specialized in musculoskeletal tumor imaging, while accuracy was higher or similar compared to radiology residents.”
    Development and evaluation of machine learning models based on X-ray radiomics for the classification and differentiation of malignant and benign bone tumors  
    Claudio E. von Schacky et al.
    European Radiology 2022 https://doi.org/10.1007/s00330-022-08764-w 
  • “In conclusion, a machine learning model using both radiomic features and demographic information was developed that showed high accuracy and discriminatory power for the distinction between benign and malignant bone tumors on radiographs of patients that underwent biopsy. The best model was based on an ANN that used both radiomics and demographic information resulting in an accuracy higher or similar compared to radiology residents. A model such as this may enhance diagnostic decision-making especially for radiologists or physicians with limited experience and may therefore improve the diagnostic work up of bone tumors.”  
    Development and evaluation of machine learning models based on X-ray radiomics for the classification and differentiation of malignant and benign bone tumors  
    Claudio E. von Schacky et al.
    European Radiology 2022 https://doi.org/10.1007/s00330-022-08764-w 
  • Background: The interpretation of radiographs suffers from an ever-increasing workload in emergency and radiology departments, while missed fractures represent up to 80% of diagnostic errors in the emergency department.  
    Purpose: To assess the performance of an artificial intelligence (AI) system designed to aid radiologists and emergency physicians in the detection and localization of appendicular skeletal fractures.
    Conclusion: The artificial intelligence aid provided a gain of sensitivity (8.7% increase) and specificity (4.1% increase) without loss of reading speed.  
    Assessment of an AI Aid in Detection of Adult Appendicular Skeletal Fractures by Emergency Physicians and Radiologists: A Multicenter Cross-sectional Diagnostic Study  
    Loïc Duron et al.  
    Radiology 2021; 300:120–129 
  • Materials and Methods: The AI system was previously trained on 60 170 radiographs obtained in patients with trauma. The radiographs were randomly split into 70% training, 10% validation, and 20% test sets. Between 2016 and 2018, 600 adult patients in whom multiview radiographs had been obtained after a recent trauma, with or without one or more fractures of shoulder, arm, hand, pelvis, leg, and foot, were retrospectively included from 17 French medical centers. Radiographs with quality precluding human interpretation or containing only obvious fractures were excluded. Six radiologists and six emergency physicians were asked to detect and localize fractures with (n = 300) and fractures without (n = 300) the aid of software highlighting boxes around AI-detected fractures. Aided and unaided sensitivity, specificity, and reading times were compared by means of paired Student t tests after averaging of performances of each reader.  
    Assessment of an AI Aid in Detection of Adult Appendicular Skeletal Fractures by Emergency Physicians and Radiologists: A Multicenter Cross-sectional Diagnostic Study  
    Loïc Duron et al.  
    Radiology 2021; 300:120–129 
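The 70% / 10% / 20% random split described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `split_indices`, the fixed seed, and the choice to split at the level of individual radiographs are all assumptions made for illustration. In practice, splits are often made per patient so that images from one patient do not land in both training and test sets.

```python
import random


def split_indices(n_items, percents=(70, 10, 20), seed=0):
    """Randomly partition range(n_items) into train/validation/test index lists.

    `percents` are whole-number percentages (hypothetical default: 70/10/20).
    Integer arithmetic is used so the split sizes are exact.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = n_items * percents[0] // 100
    n_val = n_items * percents[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# 60 170 is the number of training radiographs quoted in the abstract.
train_idx, val_idx, test_idx = split_indices(60170)
print(len(train_idx), len(val_idx), len(test_idx))
```

With the quoted 60 170 radiographs, this yields 42 119 training, 6 017 validation, and 12 034 test items.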
  • Results: A total of 600 patients (mean age ± standard deviation, 57 years ± 22; 358 women) were included. The AI aid improved the sensitivity of physicians by 8.7% (95% CI: 3.1, 14.2; P = .003 for superiority) and the specificity by 4.1% (95% CI: 0.5, 7.7; P < .001 for noninferiority) and reduced the average number of false-positive fractures per patient by 41.9% (95% CI: 12.8, 61.3; P = .02) in patients without fractures and the mean reading time by 15.0% (95% CI: −30.4, 3.8; P = .12). Finally, stand-alone performance of a newer release of the AI system was greater than that of all unaided readers, including skeletal expert radiologists, with an area under the receiver operating characteristic curve of 0.94 (95% CI: 0.92, 0.96).  
    Assessment of an AI Aid in Detection of Adult Appendicular Skeletal Fractures by Emergency Physicians and Radiologists: A Multicenter Cross-sectional Diagnostic Study  
    Loïc Duron et al.  
    Radiology 2021; 300:120–129 
  •  Summary  
    The artificial intelligence aid improved the sensitivity and specificity of radiologists and emergency physicians in the localization of appendicular fractures on radiographs, with no additional reading time.  
    Key Results  
    • The artificial intelligence (AI) aid, which highlighted potential fractures on full-resolution radiographs, improved the sensitivity (8.7% increase, P = .006) and specificity (4.1% increase, P = .03) of emergency doctors and radiologists in the diagnosis of appendicular fractures.  
    • The stand-alone area under the receiver operating characteristic curve, requiring that the AI system detect the precise locations of all fractures on an examination, was .94 with a newer release of the AI system.  
    Assessment of an AI Aid in Detection of Adult Appendicular Skeletal Fractures by Emergency Physicians and Radiologists: A Multicenter Cross-sectional Diagnostic Study  
    Loïc Duron et al.  
    Radiology 2021; 300:120–129 

  • "Our study had several limitations. First, readers and the AI system were assessed on their ability to make decisions based on image analysis alone, without knowledge about the findings from the patients’ physical examination or their medical history, creating a context bias. Clinical data can be crucial in making decisions; however, in our experience radiologists often lack relevant clinical data. Second, a Hawthorne effect may have affected the performances of readers, that is, a modification of their behavior in response to their awareness of being observed for the research project, leading, for instance, to a more thorough reading than in clinical practice. Similarly, cognitive biases related to the emergency setting could not be replicated in a retrospective study.”
    Assessment of an AI Aid in Detection of Adult Appendicular Skeletal Fractures by Emergency Physicians and Radiologists: A Multicenter Cross-sectional Diagnostic Study  
    Loïc Duron et al.  
    Radiology 2021; 300:120–129 
  • "In conclusion, we showed that a deep learning algorithm aided emergency physicians and radiologists in improving their diagnostic performance and boosting their time efficiency in the localization of all appendicular bone fractures on plain radiographs. The algorithm improved as updates were made, which bodes well for helping physicians cope with the increasing workload more effectively, and an evaluation in future prospective studies will be needed.”
    Assessment of an AI Aid in Detection of Adult Appendicular Skeletal Fractures by Emergency Physicians and Radiologists: A Multicenter Cross-sectional Diagnostic Study  
    Loïc Duron et al.  
    Radiology 2021; 300:120–129 
  • “Artificial intelligence (AI) has the potential to affect every step of the radiology workflow, but the AI application that has received the most press in recent years is image interpretation, with numerous articles describing how AI can help detect and characterize abnormalities as well as monitor disease response. Many AI-based image interpretation tasks for musculoskeletal (MSK) pathologies have been studied, including the diagnosis of bone tumors, detection of osseous metastases, assessment of bone age, identification of fractures, and detection and grading of osteoarthritis. This article explores the applications of AI for image interpretation of MSK pathologies.”
    Pattern Recognition in Musculoskeletal Imaging Using Artificial Intelligence
    Natalia Gorelik, Jaron Chong, Dana J. Lin
    Semin Musculoskelet Radiol 2020;24:38–49.
  • “Artificial intelligence (AI) has the potential to affect every step of the radiology workflow from ordering with clinical decision support, examination scheduling and protocoling, image acquisition and reconstruction, radiation dose estimation and reduction, quality control, optimization of automatic image display with hanging protocols, worklist management with prioritization of urgent or abnormal studies, integration of radiologic data with clinical data, quantitative image analysis, structured reporting, delivery of results to the referring physician, to billing and coding.”
    Pattern Recognition in Musculoskeletal Imaging Using Artificial Intelligence
    Natalia Gorelik, Jaron Chong, Dana J. Lin
    Semin Musculoskelet Radiol 2020;24:38–49.
  • “The proficiency of AI applications in pattern recognition holds great promise for improving patient care through achieving higher diagnostic accuracy, better predicting individual outcomes, and increasing radiologists’ efficiency, which is essential in light of the ever-increasing imaging volumes in both absolute number of examinations as well as the amount of data per study. In this article we reviewed how pattern recognition in MSK imaging using AI could facilitate the diagnosis of bone tumors, detection of bone metastases, evaluation of pediatric bone age, identification of fractures, labeling of images, and assessment of OA. Future research will no doubt further expand on the variety of MSK pathologies that can be addressed with AI-based solutions. As this field continues to evolve, radiology researchers, societies, and industry will collaborate to tackle the challenges ahead to improve radiology, technology, and patient care.”
    Pattern Recognition in Musculoskeletal Imaging Using Artificial Intelligence
    Natalia Gorelik, Jaron Chong, Dana J. Lin
    Semin Musculoskelet Radiol 2020;24:38–49.
  • “The advent of AI in radiology lends a quantitative lens to the imaging practice to create more value for the patient and the referring physicians. We anticipate that integration of AI tools with BI&A will continue to rise at a rapid pace, particularly as demands for quality and efficiency grow, and our imaging informatics infrastructure grows increasingly complex. Specifically, the most salient growth will depend on the guidance of national professional societies such as the ACR to align AI development along appropriate standards, and the fastest business AI development is likely to arise from the pressure points along various regulatory drivers such as merit-based incentive payments and APMs.”
    From Data to Value: How Artificial Intelligence Augments the Radiology Business to Create Value
    Teresa Martin-Carreras, Po-Hao Chen
    Semin Musculoskelet Radiol 2020;24:65–73.

  • “AI implemented poorly risks pushing humanity to the margins; done wisely, AI can free up physicians’ cognitive and emotional space for patients, and shift the focus away from transactional tasks to personalized care. The challenge will be for humans to have the wisdom and willingness to discern AI’s optimal role in twenty-first century healthcare, and to determine when it strengthens and when it undermines human healing.”
    Ten Ways Artificial Intelligence Will Transform Primary Care
    Steven Y. Lin, Megan R. Mahoney, Christine A. Sinsky
    J Gen Intern Med 34(8):1626–30
  • “The use of AI has the potential to greatly enhance every component of the imaging value chain. From assessing the appropriateness of imaging orders to helping predict patients at risk for fracture, AI can increase the value that musculoskeletal imagers provide to their patients and to referring clinicians by improving image quality, patient centricity, imaging efficiency, and diagnostic accuracy.”
    Artificial Intelligence in Musculoskeletal Imaging: Current Status and Future Directions
    Gyftopoulos S et al.
    AJR 2019; 213:1–8
  • “Several studies have shown promising results of using ML to determine bone age. Using datasets from two separate children’s hospitals, Larson et al. found that their deep CNN was able to estimate skeletal maturity with accuracy comparable to that of an expert radiologist as well as to that of existing automated bone age software. Tajmir et al. showed that AI-assisted radiologist interpretation performed better than AI alone, a radiologist alone, or a pooled cohort of experts, by increasing accuracy and decreasing variability and the root-mean-square error. Their findings suggest that the most optimal use of AI for determination of bone age may be in combination with a radiologist’s interpretation.”
    Artificial Intelligence in Musculoskeletal Imaging: Current Status and Future Directions
    Gyftopoulos S et al.
    AJR 2019; 213:1–8
  • “Radiomics is an emerging field in medicine that is based on the extraction of diverse quantitative characteristics from images and the use of these characteristics for data mining and pattern identification. These data can then be used with other patient information to better characterize and predict disease processes. ML techniques have led to a rapid expansion of the potential of radiomics to impact clinical care. For instance, the description of a sarcoma diagnosed on MRI will typically include estimates of tumor size, shape, and enhancement pattern. ML-driven algorithms can also identify and collect other characteristics that are not easily appreciated on images (e.g., texture analysis, image intensity histograms, and image voxel relationships) and can lead to more precise treatment.”
    Artificial Intelligence in Musculoskeletal Imaging: Current Status and Future Directions
    Gyftopoulos S et al.
    AJR 2019; 213:1–8

  • AI and MR of the Knee
  • Purpose: To investigate the feasibility of using a deep learning–based approach to detect an anterior cruciate ligament (ACL) tear within the knee joint at MRI by using arthroscopy as the reference standard.
    Results: The sensitivity and specificity of the ACL tear detection system at the optimal threshold were 0.96 and 0.96, respectively. In comparison, the sensitivity of the clinical radiologists ranged between 0.96 and 0.98, while the specificity ranged between 0.90 and 0.98. There was no statistically significant difference in diagnostic performance between the ACL tear detection system and clinical radiologists at P < .05. The area under the ROC curve for the ACL tear detection system was 0.98, indicating high overall diagnostic accuracy.
    Conclusion: There was no significant difference between the diagnostic performance of the ACL tear detection system and clinical radiologists for determining the presence or absence of an ACL tear at MRI.
    Fully Automated Diagnosis of Anterior Cruciate Ligament Tears on Knee MR Images by Using Deep Learning
    Fang Liu et al.
    Radiology: Artificial Intelligence 2019; 1(3):e180091 • https://doi.org/10.1148/ryai.2019180091
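As an illustration of how sensitivity and specificity depend on the decision threshold applied to a model's output score, the sketch below computes both at a given cutoff and picks the cutoff that maximizes Youden's J (sensitivity + specificity − 1). The excerpt does not say how the study chose its "optimal threshold", so Youden's J is an assumption here, and the scores and labels are hypothetical toy values, not study data.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores, labels):
    """Cutoff maximizing Youden's J (an assumed criterion, not the paper's)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: sum(sens_spec(scores, labels, t)) - 1)


# Hypothetical toy data: 1 = tear present, 0 = no tear.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]

cutoff = best_threshold(toy_scores, toy_labels)
sensitivity, specificity = sens_spec(toy_scores, toy_labels, cutoff)
print(cutoff, sensitivity, specificity)
```

Lowering the cutoff raises sensitivity at the cost of specificity, which is why a single pair of figures such as 0.96 and 0.96 only describes one operating point on the ROC curve.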
  • Summary
    * There was no statistically significant difference between the anterior cruciate ligament (ACL) tear detection system and clinical radiologists with varying levels of experience for determining the presence or absence of a full-thickness ACL tear using sagittal proton density–weighted and fat-suppressed T2-weighted fast spin-echo MR images.
    Key Points
    * There was no significant difference between the diagnostic performance of a fully automated deep learning–based diagnosis system and clinical radiologists for detecting a full-thickness anterior cruciate ligament (ACL) tear at MRI.
    * Sensitivity and specificity of the ACL tear detection system at the optimal threshold were 0.96 and 0.96, respectively; the sensitivity of the clinical radiologists ranged between 0.96 and 0.98 and specificity ranged between 0.90 and 0.98.
    Fully Automated Diagnosis of Anterior Cruciate Ligament Tears on Knee MR Images by Using Deep Learning
    Fang Liu et al.
    Radiology: Artificial Intelligence 2019; 1(3):e180091 • https://doi.org/10.1148/ryai.2019180091
  • “This study showed that a deep learning model can be trained to detect wrist fractures in radiographs with diagnostic accuracy similar to that of senior subspecialized orthopedic surgeons. Additionally, this study showed that, when emergency medicine clinicians are provided with the assistance of the trained model, their ability to detect wrist fractures can be significantly improved, thus diminishing diagnostic errors and also improving the clinicians’ efficiency."
    Deep neural network improves fracture detection by clinicians
    Lindsey R et al.
    PNAS | November 6, 2018 | vol. 115 | no. 45 | 11591–11596
  • “This study shows that deep learning models offer potential for subspecialized clinicians (without machine learning experience) to teach computers how to emulate their diagnostic expertise and thereby help patients on a global scale. Although teaching the model is a laborious process requiring collecting thousands of radiographs and carefully labeling them, making a prediction using the trained model takes less than a second on a modern computer.”
    Deep neural network improves fracture detection by clinicians
    Lindsey R et al.
    PNAS | November 6, 2018 | vol. 115 | no. 45 | 11591–11596
  • “Historically, computer-assisted detection (CAD) in radiology has failed to achieve improvements in diagnostic accuracy, decreasing clinician sensitivity and leading to unnecessary further diagnostic tests. With the advent of deep learning approaches to CAD, there is great excitement about its application to medicine, yet there is little evidence demonstrating improved diagnostic accuracy in clinically-relevant applications. We trained a deep learning model to detect fractures on radiographs with a diagnostic accuracy similar to that of senior subspecialized orthopedic surgeons. We demonstrate that when emergency medicine clinicians are provided with the assistance of the trained model, their ability to accurately detect fractures significantly improves.”
    Deep neural network improves fracture detection by clinicians
    Robert Lindsey et al.
    Proc Natl Acad Sci U S A. 2018 Nov 6;115(45):11591-11596
  • In this work, we developed a deep neural network to detect and localize fractures in radiographs. We trained it to accurately emulate the expertise of 18 senior subspecialized orthopedic surgeons by having them annotate 135,409 radiographs. We then ran a controlled experiment with emergency medicine clinicians to evaluate their ability to detect fractures in wrist radiographs with and without the assistance of the deep learning model. The average clinician’s sensitivity was 80.8% (95% CI, 76.7–84.1%) unaided and 91.5% (95% CI, 89.3–92.9%) aided, and specificity was 87.5% (95% CI, 85.3–89.5%) unaided and 93.9% (95% CI, 92.9–94.9%) aided. The average clinician experienced a relative reduction in misinterpretation rate of 47.0% (95% CI, 37.4–53.9%).
    Deep neural network improves fracture detection by clinicians
    Robert Lindsey et al.
    Proc Natl Acad Sci U S A. 2018 Nov 6;115(45):11591-11596
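To make the idea of a relative reduction in error rate concrete, the following sketch applies the generic formula (before − after) / before to the average sensitivity and specificity quoted above. This is only an illustration of the arithmetic. It does not reproduce the paper's 47.0% figure, which is reported for the average clinician and whose exact computation is not given in this excerpt. The function name `relative_reduction` is hypothetical.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional drop in an error rate: (before - after) / before."""
    return (before - after) / before


# Error rates derived from the averages quoted in the abstract.
miss_unaided = 1 - 0.808   # missed-fracture rate = 1 - sensitivity (unaided)
miss_aided = 1 - 0.915     # missed-fracture rate (aided)
fp_unaided = 1 - 0.875     # false-positive rate = 1 - specificity (unaided)
fp_aided = 1 - 0.939       # false-positive rate (aided)

miss_reduction = relative_reduction(miss_unaided, miss_aided)
fp_reduction = relative_reduction(fp_unaided, fp_aided)

print(f"missed fractures: {miss_reduction:.1%} relative reduction")
print(f"false positives:  {fp_reduction:.1%} relative reduction")
```

On these pooled averages the missed-fracture rate falls by roughly 56% and the false-positive rate by roughly 51% in relative terms. An overall misinterpretation rate would additionally depend on the proportion of fracture cases in the reading set.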
  • “The significant improvements in diagnostic accuracy that we observed in this study show that deep learning methods are a mechanism by which senior medical specialists can deliver their expertise to generalists on the front lines of medicine, thereby providing substantial improvements to patient care.”
    Deep neural network improves fracture detection by clinicians
    Robert Lindsey et al.
    Proc Natl Acad Sci U S A. 2018 Nov 6;115(45):11591-11596
  • Misinterpretation of radiographs may have grave consequences, resulting in complications including malunion with restricted range of motion, posttraumatic osteoarthritis, and joint collapse, the latter of which may require joint replacement. Misdiagnoses are also the primary cause of malpractice claims or litigation. There are multiple factors that can contribute to radiographic misinterpretations of fractures by clinicians, including physician fatigue, lack of subspecialized expertise, and inconsistency among reading physicians.
    Deep neural network improves fracture detection by clinicians
    Robert Lindsey et al.
    Proc Natl Acad Sci U S A. 2018 Nov 6;115(45):11591-11596
  • “The approach of this investigation is to apply machine learning algorithms trained by experts in the field to less experienced clinicians (who are at particular risk for diagnostic errors yet responsible for primary patient care and triage) to improve both their performance and efficiency. The learning model presented in this study mitigates these factors. It does not become fatigued, it always provides a consistent read, and it gains subspecialized expertise by being provided with labeled radiographs from human experts.”
    Deep neural network improves fracture detection by clinicians
    Robert Lindsey et al.
    Proc Natl Acad Sci U S A. 2018 Nov 6;115(45):11591-11596
  • Thus, we speculate that, someday, technology may permit any patient whose clinician has computer access to receive the same high-quality radiographic interpretations as those received by the patients of senior subspecialized experts.
    Deep neural network improves fracture detection by clinicians
    Robert Lindsey et al.
    Proc Natl Acad Sci U S A. 2018 Nov 6;115(45):11591-11596
  • “We have shown that radiological scores can be predicted to an excellent standard using only the disc-specific assessments as a reference set. The proposed method is quite general, and although we have implemented it here for sagittal T2 scans, it could easily be applied to T1 scans or axial scans, and for radiological features not studied here or indeed to any medical task where label/grading might be available only for a small region or a specific anatomy of an image. One benefit of automated reading is to produce a numerical signal score that would provide a scale of degeneration and so avoid an arbitrary categorization into artificial grades.”
    Automation of reading of radiological features from magnetic resonance images (MRIs) of the lumbar spine without human intervention is comparable with an expert radiologist
    Jamaludin A et al.
    Eur Spine J 2018; DOI 10.1007/s00586-017-4956-3
  • “Automation of radiological grading is now on par with human performance. The system can be beneficial in aiding clinical diagnoses in terms of objectivity of gradings and the speed of analysis. It can also draw the attention of a radiologist to regions of degradation. This objectivity and speed is an important stepping stone in the investigation of the relationship between MRIs and clinical diagnoses of back pain in large cohorts.”
    Automation of reading of radiological features from magnetic resonance images (MRIs) of the lumbar spine without human intervention is comparable with an expert radiologist
    Jamaludin A et al.
    Eur Spine J 2018; DOI 10.1007/s00586-017-4956-3
  • The process in a flow chart
  • One of the biggest potential bottlenecks that could inhibit or derail AI development and adoption in health care is the availability of sufficient quantities of high-quality data in standardized formats. As noted earlier, information today is highly fragmented and spread across the industry, residing in diverse, mostly uncoordinated repositories like electronic medical records, laboratory and imaging systems, physician notes, and health-insurance claims. Merging this information into large, integrated databases, which is required to empower AI to develop the deep understanding of diseases and their cures, is difficult.
    Artificial Intelligence: The Next Digital Frontier
    McKinsey Global Institute (2017)
  • FDA Statement
    The OsteoDetect software is a computer-aided detection and diagnostic software that uses an artificial intelligence algorithm to analyze two-dimensional X-ray images for signs of distal radius fracture, a common type of wrist fracture. The software marks the location of the fracture on the image to aid the provider in detection and diagnosis.
  • FDA Statement
    OsteoDetect analyzes wrist radiographs using machine learning techniques to identify and highlight regions of distal radius fracture during the review of posterior-anterior (front and back) and medial-lateral (sides) X-ray images of adult wrists. OsteoDetect is intended to be used by clinicians in various settings, including primary care, emergency medicine, urgent care and specialty care, such as orthopedics. It is an adjunct tool and is not intended to replace a clinician’s review of the radiograph or his or her clinical judgment.
  • FDA Approval Statement (AIDOC)
  • "Deep learning–based approaches have the potential to maximize diagnostic performance for detecting cartilage degeneration and acute cartilage injury within the knee joint while reducing subjectivity, variability, and errors due to distraction and fatigue associated with human interpretation."
    Deep Learning Approach for Evaluating Knee MR Images: Achieving High Diagnostic Performance for Cartilage Lesion Detection
    Fang Liu et al.
    Radiology 2018 (in press)
  • Skeletal bone age assessment is a common clinical practice to investigate endocrinology, genetic and growth disorders in children. It is generally performed by radiological examination of the left hand by using either the Greulich and Pyle (G&P) method or the Tanner-Whitehouse (TW) one. However, both clinical procedures show several limitations, from the examination effort of radiologists to (most importantly) significant intra- and inter-operator variability. To address these problems, several automated approaches (especially relying on the TW method) have been proposed; nevertheless, none of them has been proved able to generalize to different races, age ranges and genders. In this paper, we propose and test several deep learning approaches to assess skeletal bone age automatically; the results showed an average discrepancy between manual and automatic evaluation of about 0.8 years, which is state-of-the-art performance.
    Deep learning for automated skeletal bone age assessment in X-ray images
    Spampinato C et al.
    Med Image Anal. 2017 Feb;36:41-51
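The "average discrepancy" between manual and automatic evaluation can be read as a mean absolute difference over paired readings. The following is a minimal Python sketch under that assumption; the quoted abstract does not define the metric precisely. The function name `mean_absolute_discrepancy` and the bone ages are hypothetical and are not taken from the paper.

```python
from statistics import mean


def mean_absolute_discrepancy(manual_years, automatic_years):
    """Average absolute difference, in years, between paired readings."""
    if len(manual_years) != len(automatic_years):
        raise ValueError("readings must be paired one-to-one")
    if not manual_years:
        raise ValueError("at least one pair of readings is required")
    return mean(abs(m - a) for m, a in zip(manual_years, automatic_years))


# Hypothetical bone ages in years, invented for illustration only.
manual = [10.0, 12.5, 7.0, 15.5]
automatic = [11.0, 12.0, 7.5, 14.5]

print(f"{mean_absolute_discrepancy(manual, automatic):.2f} years")
```

A value near 0.8 on a real test set would correspond to the figure quoted above, provided the paper uses the same definition.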

  • “In this paper, we propose and test several deep learning approaches to assess skeletal bone age automatically; the results showed an average discrepancy between manual and automatic evaluation of about 0.8 years, which is state-of-the-art performance. Furthermore, this is the first automated skeletal bone age assessment work tested on a public dataset and for all age ranges, races and genders, for which the source code is available, thus representing an exhaustive baseline for future research in the field. Beside the specific application scenario, this paper aims at providing answers to more general questions about deep learning on medical images: from the comparison between deep-learned features and manually-crafted ones, to the usage of deep-learning methods trained on general imagery for medical problems, to how to train a CNN with few images.”
    Deep learning for automated skeletal bone age assessment in X-ray images
    Spampinato C et al.
    Med Image Anal. 2017 Feb;36:41-51
  • “An automated machine learning computer system was created to detect, anatomically localize, and categorize vertebral compression fractures at high sensitivity and with a low false-positive rate, as well as to calculate vertebral bone density, on CT images.”
    Vertebral Body Compression Fractures and Bone Density: Automated Detection and Classification on CT Images
    Burns JE et al.
    Radiology (in press)
  • “Sensitivity for detection or localization of compression fractures was 95.7% (201 of 210; 95% confidence interval [CI]: 87.0%, 98.9%), with a false-positive rate of 0.29 per patient. Additionally, sensitivity was 98.7% and specificity was 77.3% at case-based receiver operating characteristic curve analysis.”
    Vertebral Body Compression Fractures and Bone Density: Automated Detection and Classification on CT Images
    Burns JE et al.
    Radiology (in press)
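The 95.7% sensitivity quoted above is the ratio 201/210. The following minimal Python sketch reproduces that arithmetic and adds a Wilson score interval. The interval method is an assumption: the quoted text does not say how the published interval was computed, and this simple per-fracture interval (roughly 92% to 98%) comes out narrower than the published 87.0% to 98.9%, so the authors presumably used a different method. The function names are hypothetical.

```python
import math


def sensitivity(true_positives, false_negatives):
    """Fraction of actual fractures that the system detected."""
    return true_positives / (true_positives + false_negatives)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the quoted result: 201 of 210 fractures detected.
detected, total = 201, 210
sens = sensitivity(detected, total - detected)
low, high = wilson_interval(detected, total)

print(f"sensitivity = {sens:.1%}")
print(f"Wilson 95% interval = {low:.1%} to {high:.1%}")
```

The false-positive rate of 0.29 per patient cannot be recomputed here because the quoted text gives neither the number of false positives nor the number of patients.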
  • “This system performed with 95.7% sensitivity in fracture detection and localization to the correct vertebral level, with a low false-positive rate. There was a high level of overall agreement (95%) for compression morphology and 68% overall agreement for severity categorization relative to radiologist classification.”
    Vertebral Body Compression Fractures and Bone Density: Automated Detection and Classification on CT Images
    Burns JE et al.
    Radiology (in press)
  • * A fully automated machine learning software system with which to detect, localize, and classify compression fractures and determine the bone density of thoracic and lumbar vertebral bodies on CT images was developed and validated.
    * The computer system has a sensitivity of 95.7% in the detection of compression fractures and in the localization of these fractures to the correct vertebrae, with a false-positive rate of 0.29 per patient.
    * The accuracy of this computer system in fracture classification by Genant type was 95% (weighted κ = 0.90).
    Vertebral Body Compression Fractures and Bone Density: Automated Detection and Classification on CT Images
    Burns JE et al.
    Radiology (in press)
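The weighted kappa reported above measures agreement between two raters beyond chance, with partial credit for near misses. The following is a minimal Python sketch of a weighted Cohen's kappa. The linear weighting, the integer category coding (0 to k-1), and the example labels are all assumptions for illustration; the quoted text does not state which weighting scheme the authors used.

```python
def weighted_kappa(rater_a, rater_b, n_categories, power=1):
    """Weighted Cohen's kappa for labels coded 0..n_categories-1.

    power=1 gives linear disagreement weights, power=2 quadratic.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equally long label lists")
    k, n = n_categories, len(rater_a)

    # Confusion matrix of rater A (rows) against rater B (columns).
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical classifications of eight fractures into three ordered classes.
software = [0, 0, 1, 1, 2, 2, 1, 0]
radiologist = [0, 0, 1, 2, 2, 2, 1, 0]
print(f"weighted kappa = {weighted_kappa(software, radiologist, 3):.2f}")
```

Identical label lists give a kappa of 1, and agreement no better than chance gives a value near 0.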


Copyright © 2022 The Johns Hopkins University, The Johns Hopkins Hospital, and The Johns Hopkins Health System Corporation. All rights reserved.