Deep Learning: ChatGPT Imaging Pearls | CTisus
Imaging Pearls ❯ Deep Learning ❯ ChatGPT


  • “Generative artificial intelligence models exhibit amazing abilities but make serious errors. We have a very limited understanding of why they work well at all or of the circumstances under which they give incorrect responses. This suggests the need for additional research and great caution in deploying such models for critical applications. Since the availability of ChatGPT in late 2022, based on OpenAI’s GPT-3.5 large language model, those of us who have explored its capabilities have been amazed by its facility with language and its abilities to generate coherent — and even insightful — synopses; answer questions about everything from general knowledge to domain-specific topics; offer advice on how to accomplish tasks, including for medical diagnosis, therapy, and prognosis; deduce consequences of assumptions; and even write effective computer programs. Nevertheless, I would urge great caution in adopting such methods in health care, mainly because of our lack of understanding of how they accomplish the miraculous-seeming things they are able to do.”
    Large Language Models Seem Miraculous, but Science Abhors Miracles
    Peter Szolovits
    NEJM AI 2024; 1 (6)
  • “Nevertheless, one can speculate about what might be happening inside LLMs. Data compression research shows that the most effective way to compress data and allow it to be regenerated (e.g., ZIP files, JPEG images, etc.) is to identify common patterns in the data. LLMs, using huge neural networks, can identify such patterns much better than earlier prespecified methods. So, perhaps LLMs may be learning multiple functions that can approximately reproduce their inputs, and these functions correspond to different facilities that the LLMs exhibit, such as relevant fact retrieval, inference, grammatical knowledge, writing style, etc. However, we do not know how to identify these facilities within the myriad trained weights in the network. To my knowledge, nobody has yet presented a coherent and insightful explanation for how the diverse capabilities of LLMs emerge.”
    Large Language Models Seem Miraculous, but Science Abhors Miracles
    Peter Szolovits
    NEJM AI 2024; 1 (6)
  • “Research on the interpretation of how large artificial neural networks operate may lead to insights about LLMs as well.5 Concern about the easy success of adversarial attacks on machine-learning systems is also leading to a new understanding about how robustness against such attacks likely leads to more correct models that are robust to internal errors as well. Work on explaining why machine-learning models come to a particular conclusion is challenging because that conclusion is likely to be the result of a vast number of numerical operations that would not be innately meaningful to a user. However, it is often possible to build simpler, easier-to-explain models that approximate the outputs of the complex model, at least in the neighborhood of the question posed. Perhaps such methods can provide good-enough explanations of complex outputs to be useful.”
    Large Language Models Seem Miraculous, but Science Abhors Miracles
    Peter Szolovits
    NEJM AI 2024; 1 (6)
  • “There is considerable policy discussion in government about the need for regulation of AI systems revolving around the issues discussed here. My personal view is that using generative AI is sensible as long as it is done under careful human supervision. Good uses include cleaning up or suggesting appropriate language, summarizing long texts, or using it in tasks such as cohort selection, where strict accuracy is desirable but errors are tolerable. I would be much more cautious in using such systems where critical decisions that affect specific individuals’ diagnosis and treatment are at stake and where errors can be very harmful. Moreover, I would certainly avoid situations in which a system is allowed to make consequential autonomous decisions. Because of our lack of understanding of how such systems work, caution is required, and we need to make much greater investments in understanding the “miracle” of generative AI and not just its exploitation.”
    Large Language Models Seem Miraculous, but Science Abhors Miracles
    Peter Szolovits
    NEJM AI 2024; 1 (6)
  • The integration of artificial intelligence (AI) in medical image interpretation requires effective collaboration between clinicians and AI algorithms. Although previous studies demonstrated the potential of AI assistance in improving overall clinician performance, the individual impact on clinicians remains unclear. This large-scale study examined the heterogeneous effects of AI assistance on 140 radiologists across 15 chest X-ray diagnostic tasks and identified predictors of these effects. Surprisingly, conventional experience-based factors, such as years of experience, subspecialty and familiarity with AI tools, fail to reliably predict the impact of AI assistance. Additionally, lower-performing radiologists do not consistently benefit more from AI assistance, challenging prevailing assumptions. Instead, we found that the occurrence of AI errors strongly influences treatment outcomes, with inaccurate AI predictions adversely affecting radiologist performance on the aggregate of all pathologies and on half of the individual pathologies investigated. Our findings highlight the importance of personalized approaches to clinician–AI collaboration and the importance of accurate AI models. By understanding the factors that shape the effectiveness of AI assistance, this study provides valuable insights for targeted implementation of AI, enabling maximum benefits for individual clinicians in clinical practice.
    Heterogeneity and predictors of the effects of AI assistance on radiologists
    Feiyang Yu et al.
    Nature Medicine | Volume 30 | March 2024 | 837–849
  • “The results revealed that the regression coefficient on unassisted error was significantly different from zero when considering all pathologies aggregated (P < 0.001; Fig. 3a and Supplementary Table 9). Similarly, the regression coefficient was significant on most individual pathologies (Benjamini–Hochberg-adjusted P < 0.05), except for atelectasis, pneumothorax and shoulder fracture (Benjamini–Hochberg-adjusted P > 0.05; Fig. 3b). We similarly constructed a linear regression model regressing from unassisted AUROC and an intercept term to assisted AUROC. The regression coefficient on unassisted AUROC was again significant on all pathologies aggregated (P < 0.001; Fig. 3c and Supplementary Table 33), whereas the coefficient was insignificant on abnormal (P > 0.05). Together, these findings indicate that unassisted error serves as a strong predictor of assisted error in most cases.”
    Heterogeneity and predictors of the effects of AI assistance on radiologists
    Feiyang Yu et al.
    Nature Medicine | Volume 30 | March 2024 | 837–849
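The regression described in the quote above, assisted error regressed on unassisted error plus an intercept, can be sketched with ordinary least squares. This is illustrative only; the numbers below are invented, not the paper's data:

```python
import statistics

# Hypothetical per-radiologist error rates (invented for illustration)
unassisted = [0.30, 0.22, 0.41, 0.18, 0.35, 0.27]  # error without AI
assisted   = [0.26, 0.20, 0.34, 0.17, 0.30, 0.24]  # error with AI assistance

def ols_slope_intercept(x, y):
    """Ordinary least squares fit for y = a*x + b."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

slope, intercept = ols_slope_intercept(unassisted, assisted)
# A clearly positive slope is what "unassisted error predicts assisted error" means
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In the study this fit is repeated per pathology, with significance testing on the slope; the sketch shows only the core regression step.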
  • “We found that experience-based radiologist characteristics, including years of experience, subspecialty in thoracic radiology and experience with AI tools, did not serve as reliable predictors of treatment effect, in terms of both calibration performance and discrimination performance. These findings challenge the associations between experience-based radiologist characteristics and the treatment effect of AI assistance reported in previous research24–28. The observed variability could be attributed to our larger and more diverse sample size, encompassing 140 radiologists with varying skill levels, experiences and preferences. Additionally, our study’s inclusion of a wide range of diagnostic tasks enables a robust examination of the complex factors influencing the treatment effect. Furthermore, the performance characteristics and quality of the specific AI system may play an important role, highlighting the need for developers to consider these factors when deploying AI assistance. To optimize the implementation of AI assistance, a comprehensive assessment of multiple factors, including the clinical task, patient population and AI system, is essential.”
    Heterogeneity and predictors of the effects of AI assistance on radiologists
    Feiyang Yu et al.
    Nature Medicine | Volume 30 | March 2024 | 837–849

  • Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models. 
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP. 
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print. PMID: 38220038.
  •  “Generative artificial intelligence (AI), specifically the large language models (LLMs) that underlie impressive new applications such as ChatGPT, are already fundamentally changing medicine. Unlike more traditional AI systems that produce simple outputs such as a number (say, the predicted length of stay for a patient in the hospital) or a category (say, “malignant” or “benign” for a radiologic system), “generative AI” refers broadly to systems whose outputs take the form of more unstructured media objects, such as images and documents. Under the hood, many of these systems are actually built by executing models that serve a more classical purpose. Generative text models, for example, generate whole documents by iteratively predicting “what word comes next.” But the ability to produce a whole document with desired properties unlocks a host of exciting applications.”
    Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models.  
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP.  
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print. PMID: 38220038
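The "iteratively predicting what word comes next" idea in the quote above can be shown in miniature. A toy sketch in which a hand-built bigram table stands in for the neural network; every word and transition here is invented for illustration:

```python
# Toy next-word table; a real LLM replaces this lookup with a neural network
# that outputs a probability over the whole vocabulary at each step.
bigram_next = {
    "the": "patient",
    "patient": "has",
    "has": "pneumonia",
}

def generate(prompt: str, max_words: int = 10) -> str:
    """Greedily append the most likely next word until no continuation exists."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = bigram_next.get(words[-1])
        if nxt is None:  # no known continuation: stop generating
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the"))  # "the patient has pneumonia"
```

The loop is the whole trick: a model that only ever answers "what word comes next" still produces a complete document by feeding its own output back in as context.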
  • “We build and leverage generative AI to unburden clinicians from clerical work to help them focus more fully on their patients. Briefly, Abridge AI helps clinicians capture conversations, structure and summarize the information exchanged, and integrate that information back into the medical record. As clinicians, we are tasked to serve three constituents for patients we see. Those constituents are (1) our care team members, who benefit from clinical notes that convey our thought process; (2) ourselves, as the physicians who also need to place orders, report diagnostic codes, and handle procedure codes for billing and revenue cycle; and (3) our patients, the most important constituents, who benefit from visit summaries and access to their OpenNotes in their portals.”
    Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models.  
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP.  
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print. PMID: 38220038
  • “The arrival of LLMs that can provide real-time assistance to physicians may allow a remarkable increase in their bandwidth, regardless of specialty. In radiology, leveraging these emerging technologies will potentially allow more scans to be read without adding burden or stress to the interpreting radiologist. We would be in the “high consequences for factual inaccuracies and high volume of decisions” quadrant of Figure 1, where the assistance of an LLM would be its key feature. The importance of that added bandwidth would be its potential to ameliorate disparities by democratizing the expertise of radiologists who are already able to handle large volumes or who may have special skill sets in less common examinations. Such improvements in access to care would hopefully have downstream effects of improved outcomes in marginalized populations.”
    Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models.  
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP.  
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print. PMID: 38220038
  • “Large language models summarizing clinical data promise powerful opportunities to streamline information-gathering from the EHR. But by dealing in language, they also bring unique risks that are not clearly covered by existing FDA regulatory safeguards. As summarization tools speed closer to clinical practice, transparent development of standards for LLM-generated clinical summaries, paired with pragmatic clinical studies, will be critical to the safe and prudent rollout of these technologies. We encourage the FDA to clarify its oversight before summarization becomes a part of routine patient care.”
    AI-Generated Clinical Summaries Require More Than Accuracy.
    Goodman KE, Yi PH, Morgan DJ.  
    JAMA. 2024 Jan 29. doi: 10.1001/jama.2024.0555. Epub ahead of print. PMID: 38285439. 
  • First, we need comprehensive standards for LLM-generated summaries, with domains that extend beyond accuracy and that include stress-testing for sycophancy and small but clinically important errors. These standards should reflect scientific and clinical consensus, with input beyond the few large technology companies developing health care LLMs. Second, LLMs performing clinical summarization are ultimately clinical aids. Regardless of current FDA regulation, we believe that they should be clinically tested to quantify clinical harms and benefits before widespread deployment. This testing carries minimal risk and could be performed as quality improvement in a learning health system.
    AI-Generated Clinical Summaries Require More Than Accuracy.
    Goodman KE, Yi PH, Morgan DJ.  
    JAMA. 2024 Jan 29. doi: 10.1001/jama.2024.0555. Epub ahead of print. PMID: 38285439.
  • “Yet there is also the potential for patient harm because LLMs performing summarization are unlikely to fall under FDA medical device oversight and could reach clinics without safety and efficacy safeguards. Indeed, FDA final guidance for clinical decision support software — published 2 months before ChatGPT’s release — provides an unintentional “roadmap” for how LLMs could avoid FDA regulation. Even LLMs performing sophisticated summarization tasks would not clearly qualify as devices because they provide general language-based outputs rather than specific predictions or numeric estimates of disease. With careful implementation, we expect that many LLMs summarizing clinical data could meet device-exemption criteria.”
    AI-Generated Clinical Summaries Require More Than Accuracy.
    Goodman KE, Yi PH, Morgan DJ. 
    JAMA. 2024 Jan 29. doi: 10.1001/jama.2024.0555. Epub ahead of print. PMID: 38285439.
  • “Informed consent is integral to the practice of medicine. Most informed consent documents are written at a reading level that surpasses the reading comprehension level of the average American. Large language models, a type of artificial intelligence (AI) with the ability to summarize and revise content, present a novel opportunity to make the language used in consent forms more accessible to the average American and thus, improve the quality of informed consent. In this study, we present the experience of the largest health care system in the state of Rhode Island in implementing AI to improve the readability of informed consent documents, highlighting one tangible application for emerging AI in the clinical setting.”
    Using ChatGPT to Facilitate Truly Informed Medical Consent
    Fatima N. Mirza et al.
    NEJM AI 2024; 1 (2)
  • “Several learning points can be gleaned from our experience using AI to improve informed consent at the largest health care system in Rhode Island. First, the process of text simplification is a task that plays to the natural “style transfer” strengths of LLMs, which are trained on a large text corpus of heterogeneous writing styles and reading levels. For a task that may take a proofreader hours or days, GPT-4 simplified a 947-word document in less than 1 minute.”
    Using ChatGPT to Facilitate Truly Informed Medical Consent
    Fatima N. Mirza et al.
    NEJM AI 2024; 1 (2)
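Because work like the Mirza et al. study is judged by reading level, a rough self-check of a simplified form is easy to script. A minimal sketch of the Flesch-Kincaid grade formula, assuming a crude vowel-group syllable heuristic (real readability tools use pronunciation dictionaries); the sample sentences are invented:

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count runs of vowels; always at least 1 syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

dense = "The procedure necessitates anticoagulation discontinuation preoperatively."
plain = "You must stop your blood thinner before the surgery."
print(fk_grade(dense) > fk_grade(plain))  # the simplified text scores at a lower grade
```

A check like this can gate an LLM-simplified consent form against a target grade level before a human reviewer signs off on it.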
  • “Our experience highlights a tangible way that AI can be deployed in the clinical setting to improve patient experiences in the here and now. Whereas the introduction of any new clinical technology presents the potential hazard of amplifying existing disparities in patient care, the reverse can also hold true. In the mission of creating a truly informed consent process, AI has the potential to broaden and strengthen our communication to reach countless vulnerable patients stranded at the periphery in a status quo mired by excessively arcane and verbose medical forms.”
    Using ChatGPT to Facilitate Truly Informed Medical Consent
    Fatima N. Mirza et al.
    NEJM AI 2024; 1 (2)
  • “More interesting now, after the heights of the pandemic, we are beginning to see a market need for technology-driven efficiencies to help health care systems deal with historic losses over the past couple of years. For example, staffing shortages are top cost drivers and a concern for hospital executives. Strategies that may have worked in the past, raising prices via market mergers and consolidation or asking clinicians to see more patients and boost volumes, are unlikely to work this time given market dynamics. Instead, the new name of the game for health systems is productivity increase but with a twist. They need to improve care delivery experiences and outcomes while also improving efficiencies. Most important, they need to meet the health care demand without increasing the exodus of frontline health care workers.”
    Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models.
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP.  
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print

  • “There is broad recognition that the burnout-inducing amount of paperwork associated with modern medical practice needs to be addressed to limit the early departure of clinicians from the field. A recent report in the Journal of General Internal Medicine suggested that doctors need 27 hours a day to complete all their work. An AMA study from 2021 suggested that 63% of physicians surveyed reported burnout. Expenses of $4.6 billion annually were related to physician turnover and reduced clinical hours in 2019. All of these statistics have only worsened on the other side of the (peak) pandemic. Our priority is to assist and integrate for the depth of the workflow in an enterprise way, spanning the work that happens before, during, and after a patient encounter from notes to orders and coding.”
    Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models.
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP.  
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print
  • “Perhaps the most important lesson for radiologists is that we need to have a seat at the table as LLMs are adopted more broadly as assistants and augmenters. We can help drive the maximum value for ourselves and, most important, our patients from those emerging technologies.”  
    Improving Efficiencies While Also Delivering Better Health Care Outcomes: A Role for Large Language Models.
    Rao SK, Fishman EK, Rizk RC, Chu LC, Rowe SP.  
    J Am Coll Radiol. 2024 Jan 12:S1546-1440(24)00005-X. doi: 10.1016/j.jacr.2024.01.003. Epub ahead of print
  • Application: Prior health AI applications often centered around diagnostics, but there are many low-risk yet high-value scenarios across the entire health system that are ripe for impact
    Evaluation: Real-world use cases are often under-represented in existing health AI benchmarks; scaling realistic benchmark creation and evaluation is of increasing urgency
    Modeling: Unlike standard contrastive learning, multimodal generative AI can benefit from gravitating to text as the “interlingua” of all modalities, given the vast amount of human knowledge captured in state-of-the-art large language models.
    Multimodal Generative AI for Precision Health
    Hoifung Poon
    NEJM AI (in press)
  • We assessed the performance of the newly released AI GPT-4 in diagnosing complex medical case challenges and compared the success rate to that of medical-journal readers. GPT-4 correctly diagnosed 57% of cases, outperforming 99.98% of simulated human readers generated from online answers. We highlight the potential for AI to be a powerful supportive tool for diagnosis; however, further improvements, validation, and addressing of ethical considerations are needed before clinical implementation.
    Use of GPT-4 to Diagnose Complex Clinical Cases
    Alexander V. Eriksen, M.D., Soren Moller, M.Sc., Ph.D., and Jesper Ryg, M.D., Ph.D.
    NEJM AI 2023; 1 (1)
  • “An important study limitation is the use of a poorly characterized population of human journal readers with unknown levels of medical skills. Moreover, we cannot assess whether the responses provided for the clinical cases reflect their maximum effort. Consequently, our results may represent a best-case scenario in favor of GPT-4. The assumption of independent answers on the 38 cases in our pseudo population is somewhat unrealistic, because some readers might consistently perform differently from others and the frequency at which participants respond correctly to the cases might depend on the level of medical skills as well as the distribution of these. However, even in the extreme case of maximally correlated correct answers among the medical-journal readers, GPT- 4 would still perform better than 72% of human readers.”
    Use of GPT-4 to Diagnose Complex Clinical Cases
    Alexander V. Eriksen, M.D., Soren Moller, M.Sc., Ph.D., and Jesper Ryg, M.D., Ph.D.
    NEJM AI 2023; 1 (1)
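The pseudo-population reasoning in the quote above can be illustrated with a toy Monte Carlo simulation. This is not the authors' code; the per-case success rates and the model's score below are invented for illustration:

```python
import random

random.seed(0)

# Hypothetical probability that a journal reader answers each case correctly
# (the study derived such rates from online poll answers; these are made up).
p_correct = [0.2, 0.35, 0.5, 0.15, 0.4, 0.3, 0.25, 0.45]
gpt4_correct = 5  # suppose the model solved 5 of these 8 cases

def simulate_reader(probs):
    """One pseudo-reader: answer each case independently with its success rate."""
    return sum(random.random() < p for p in probs)

scores = [simulate_reader(p_correct) for _ in range(10_000)]
pct = 100 * sum(s < gpt4_correct for s in scores) / len(scores)
print(f"GPT-4 outperforms {pct:.1f}% of simulated readers")
```

The independence assumption in `simulate_reader` is exactly the limitation the authors flag: if readers' correct answers are correlated, the simulated population spreads out and the model's percentile drops, which is why they also report the maximally correlated worst case.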
  • “Additionally, whereas GPT-4 in our study worked only on written records, future AI tools that are more specialized are expected to include other data sources, including medical imaging and structured numerical measurements, in their predictions. Importantly, future models should include training data from developing countries to ensure a broad, global benefit of this technology and reduce the potential for health care disparities. AI based on LLMs might be relevant not only for in-patient hospital settings but also for first-line screening that is performed either in general practice or by patients themselves. As we move toward this future, the ethical implications surrounding the lack of transparency by commercial models such as GPT-4 also need to be addressed,1 as well as regulatory issues on data protection and privacy. Finally, clinical studies evaluating accuracy, safety, and validity should precede future implementation.”
    Use of GPT-4 to Diagnose Complex Clinical Cases
    Alexander V. Eriksen, M.D., Soren Moller, M.Sc., Ph.D., and Jesper Ryg, M.D., Ph.D.
    NEJM AI 2023; 1 (1)
  • “Overall, GPT-4 has potential clinical use in older patients without a definitive clinical diagnosis after 1 month but requires comprehensive entry of demographic and clinical (including radiological and pharmacological) information. GPT-4 may increase confidence in diagnosis and earlier commencement of appropriate treatment, alert clinicians missing important diagnoses, and offer suggestions similar to specialists to achieve the correct clinical diagnosis, which has potential value in low-income countries with lack of specialist care. Clinicians need to be aware that GPT-4 is limited in multifocal infection, and the suggested management plan should be correlated with clinical context, as suggestions may be redundant. Clinicians should consider a drug review and review the possible diagnosis of malignant disease if suggested.”
    Use of GPT-4 to Analyze Medical Records of Patients With Extensive Investigations and Delayed Diagnosis
    Yat-Fung Shea et al.
    JAMA Network Open. 2023;6(8):e2325000
  • “This study has several limitations. First, GPT-4 may not detect 2 focuses of infection or pinpoint the source of recurrent infection. Second, GPT-4 did not suggest the use of gallium scan or 18-fluorodeoxyglucose positron emission tomography to look for infections or malignant neoplasms in all but 1 patient. Third, some investigations may not be appropriate (eg, temporal artery biopsy in the absence of typical symptoms of giant cell arteritis). Overall, our findings suggest that the use of AI in diagnosis is both promising and challenging.”
    Use of GPT-4 to Analyze Medical Records of Patients With Extensive Investigations and Delayed Diagnosis
    Yat-Fung Shea et al.
    JAMA Network Open. 2023;6(8):e2325000
  • “Radiologists have experienced rapid and far-reaching technological changes in their practices. Whether it is computerized tomography, where the conversation focused on number of detectors (64, 128, 256, 512), or transformation of numbers of X-ray tubes in a scanner (1 or 2) into a whole new technology with photon scanning, or magnetic resonance where the arguments have progressed from 1.5 to 3 Tesla scanners to 7 Tesla units, the technology is transformed quickly. Radiologists and their patients have benefited from change and celebrate it. But those changes took years, and the end-users of those technological changes generally watched its development from afar. Even the integration of Artificial Intelligence (AI) into clinical practice has developed gradually, from the theoretical use of AI to help with image interpretation or workflow management into frequent, integrated use in both private practice and academic practice has become standard clinical practice, whether for pulmonary embolism detection, analysis of mammograms, or triage of an emergency room reading list.”
    Watching Innovation in Real Time: The Story of ChatGPT and Radiology.  
    Fishman EK, Weeks WB, Lavista Ferres JM, Chu LC.  
    Can Assoc Radiol J. 2023 May 3:8465371231174817. doi: 10.1177/08465371231174817. Epub ahead of print. PMID: 37138372.
  • “Some of the explanation for these improvements in performance may be explained by the fact that, to date, ChatGPT has been trained only on publicly available information and not specifically on medicine. As versions of large language models are trained on medical books, articles, journals, and guidelines, the power of large language models in medicine will become even stronger. That ChatGPT can pass medical licensing exams is especially impressive when one realizes that the systems were not specifically trained for these tasks and that the available information was often limited as, for example, they could not look at material behind a paywall. We are beginning to realize that if properly guided, the power of this new technology will redefine medicine.3,4 The key then will be to use this technology to keep the doctor-patient relationship alive rather than replacing it. Clinicians need to guide the technology into practice.”
    Watching Innovation in Real Time: The Story of ChatGPT and Radiology.  
    Fishman EK, Weeks WB, Lavista Ferres JM, Chu LC.  
    Can Assoc Radiol J. 2023 May 3:8465371231174817. doi: 10.1177/08465371231174817. Epub ahead of print. PMID: 37138372.
  • “The speed of change in the world of technology has never been as rapid as it is today. For clinicians - as well as clinical researchers - this can translate to both the best of times and the worst of times. While clinicians and clinical researchers become excited by the development of a new computer chip, CT scanner, or vaccine, they know that time will pass before the new technology impacts our daily workflow. That is, until the introduction of large language models like ChatGPT, an offering that allows us to use the technology as it is being developed. This technology – which has been in development for decades - has the chance to impact everything we do at work or at home. And advancements in this technology, fostered by incredible computing power and amounts of data available for learning, occur so quickly that before you can write and publish an article, your descriptions may be one to two versions behind. The story of ChatGPT is not only one of imagining what could be but also of seeing the change in near real time.”
    Watching Innovation in Real Time: The Story of ChatGPT and Radiology.  
    Fishman EK, Weeks WB, Lavista Ferres JM, Chu LC.  
    Can Assoc Radiol J. 2023 May 3:8465371231174817. doi: 10.1177/08465371231174817. Epub ahead of print. PMID: 37138372.
  • “In a recent editorial, Microsoft's president Brad Smith highlights that, like any other technology in history, AI will be used both as a tool and a weapon. In the fields of medicine and research, large language models can be intentionally misused to create scientific-sounding papers containing false information, which can easily spread on the internet. As some individuals are advocating for a shift away from traditional peer-review processes in medical journals, we must emphasize that the human peer-review process is more vital and critical than ever. Further, in what may be an arms race, reviewers and editors must make use of tools such as ChatGPT to uncover unethical activities on the part of authors.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology
    Juan M. Lavista Ferres, William B. Weeks, Linda C. Chu, Steven P. Rowe, Elliot K. Fishman
    Diagnostic and Interventional Imaging, Volume 104, Issue 6, 2023, Pages 263-264
  • “AI will undoubtedly have a profound impact across all aspects of civilization. While we should embrace its benefits, we must also be aware of its potential drawbacks and limitations. With every step forward, there are bound to be missteps, and the road ahead will likely be full of surprises. Technology will continue to challenge, motivate, and occasionally even frighten us, but ultimately it has the potential to enhance our lives in countless ways.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology
    Juan M. Lavista Ferres, William B. Weeks, Linda C. Chu, Steven P. Rowe, Elliot K. Fishman
    Diagnostic and Interventional Imaging, Volume 104, Issue 6, 2023, Pages 263-264
  • “Looking toward the future, AI systems will undoubtedly continue to advance and evolve [3]. As with any tool, it is essential to understand where it can contribute and in what scenarios it will not. When Tim Berners-Lee created the World Wide Web (WWW) in the 1990s, people, for the first time, had access to information that was not previously available. As we have come to learn, the web also is full of inaccuracies and misinformation. As a society, we are still working to help humans deal with this, but this was not a reason to throw out the WWW.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology
    Juan M. Lavista Ferres, William B. Weeks, Linda C. Chu, Steven P. Rowe, Elliot K. Fishman
    Diagnostic and Interventional Imaging, Volume 104, Issue 6, 2023, Pages 263-264
  • IMPORTANCE Consumers are increasingly using artificial intelligence (AI) chatbots as a source of information. However, the quality of the cancer information generated by these chatbots has not yet been evaluated using validated instruments.
    OBJECTIVE To characterize the quality of information and presence of misinformation about skin, lung, breast, colorectal, and prostate cancers generated by 4 AI chatbots.
    CONCLUSIONS AND RELEVANCE Findings of this cross-sectional study suggest that AI chatbots generally produce accurate information for the top cancer-related search queries, but the responses are not readily actionable and are written at a college reading level. These limitations suggest that AI chatbots should be used supplementarily and not as a primary source for medical information.
    Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer
    Alexander Pan et al.
    JAMA Oncol. doi:10.1001/jamaoncol.2023.2947
  • “The primary outcomes were the quality of consumer health information based on the validated DISCERN instrument (scores from 1 [low] to 5 [high] for quality of information) and the understandability and actionability of this information based on the understandability and actionability domains of the Patient Education Materials Assessment Tool (PEMAT) (scores of 0%-100%, with higher scores indicating a higher level of understandability and actionability). Secondary outcomes included misinformation scored using a 5-item Likert scale (scores from 1 [no misinformation] to 5 [high misinformation]) and readability assessed using the Flesch-Kincaid Grade Level readability score.”
    Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer
    Alexander Pan et al.
    JAMA Oncol. doi:10.1001/jamaoncol.2023.2947
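    The Flesch-Kincaid Grade Level cited in this methodology follows a published formula: 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. A minimal sketch of that computation is below; the syllable counter is a rough vowel-group heuristic for illustration only, not the validated tooling the study would have used.

    ```python
    import re

    def flesch_kincaid_grade(text):
        """Approximate Flesch-Kincaid Grade Level of a passage."""
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)

        def syllables(word):
            # Crude heuristic: count groups of consecutive vowels.
            return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

        total_syllables = sum(syllables(w) for w in words)
        return (0.39 * len(words) / len(sentences)
                + 11.8 * total_syllables / len(words)
                - 15.59)

    simple = flesch_kincaid_grade("The cat sat on the mat.")
    dense = flesch_kincaid_grade(
        "Comprehensive multidisciplinary evaluation necessitates "
        "interdepartmental collaboration.")
    ```

    Longer sentences with more polysyllabic words push the grade level up, which is why chatbot answers written in dense clinical prose score at a college reading level.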
  • Key Points
    Question What is the quality of cancer-related health information outputted by artificial intelligence (AI) chatbots?
    Findings In this cross-sectional study, the responses of 4 AI chatbots to the top search queries related to the 5 most prevalent cancers were high quality but were written at a college reading level and had poor actionability.
    Meaning Findings of this study suggest that AI chatbots are an accurate and reliable supplementary resource for medical information but are limited in their readability and should not replace health care professionals for individualized health care questions.  
    Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer
    Alexander Pan et al.
    JAMA Oncol. doi:10.1001/jamaoncol.2023.2947
  • “Artificial intelligence chatbots are becoming a major source of medical information for consumers. Findings of this cross-sectional study suggest that they generally produce reliable and accurate medical information about lung, breast, colorectal, skin, and prostate cancers. However, the usefulness of the information is limited by its poor readability and lack of visual aids. These limitations suggest that AI chatbots should be used supplementarily and not as a primary source for medical information. To this end, AI chatbots typically encourage users to seek medical attention relating to cancer symptoms and treatment.”
    Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer
    Alexander Pan et al.
    JAMA Oncol. doi:10.1001/jamaoncol.2023.2947
  • Large language models (LLMs) can respond to free-text queries without being specifically trained in the task in question, causing excitement and concern about their use in healthcare settings. ChatGPT is a generative artificial intelligence (AI) chatbot produced through sophisticated fine-tuning of an LLM, and other tools are emerging through similar developmental processes. Here we outline how LLM applications such as ChatGPT are developed, and we discuss how they are being leveraged in clinical settings. We consider the strengths and limitations of LLMs and their potential to improve the efficiency and effectiveness of clinical, educational and research work in medicine.
    Large language models in medicine
    Arun James Thirunavukarasu et al.
    Nature Medicine https://doi.org/10.1038/s41591-023-02448-8 
  • ChatGPT (OpenAI) is an LLM chatbot: a generative AI application that now produces text in response to multimodal input (having previously accepted only text input). Its backend LLM is Generative Pretrained Transformer 3.5 or 4 (GPT-3.5 or GPT-4), described below. ChatGPT’s impact stems from its conversational interactivity and near-human-level or equal-to-human-level performance in cognitive tasks across fields, including medicine. ChatGPT has attained passing-level performance in United States Medical Licensing Examinations, and there have been suggestions that LLM applications may be ready for use in clinical, educational or research settings.
    Large language models in medicine
    Arun James Thirunavukarasu et al.
    Nature Medicine https://doi.org/10.1038/s41591-023-02448-8 
  • Deep learning: a variant of machine learning involving neural networks with multiple layers of processing ‘perceptrons’ (nodes), which together facilitate extraction of higher features of unstructured input data (for example, images, video and text).  
    Generative artificial intelligence: computational systems capable of producing content, such as text, images or sound, on demand.
    Large language model: a type of AI model using deep neural networks to learn the relationships between words in natural language, using large datasets of text to train.
    Machine learning: a field of AI featuring models that enable computers to learn and make predictions based on input data, learning from experience.  
    Natural language processing: a field of AI research focusing on the interaction between computers and human language.
    Neural network: computing systems inspired by biological neural networks, comprising ‘perceptrons’ (nodes), usually arranged in layers, communicating with one another and performing transformations upon input data.
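  • The glossary's neural-network entries can be made concrete with a minimal sketch of one layer of "perceptrons": each node computes a weighted sum of its inputs plus a bias, then applies a nonlinear activation. All weights and values below are illustrative, not drawn from the cited paper.

    ```python
    import math

    def perceptron_layer(inputs, weights, biases):
        """One layer of perceptron nodes with sigmoid activation."""
        outputs = []
        for node_weights, bias in zip(weights, biases):
            z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
            outputs.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid squashes to (0, 1)
        return outputs

    # Two input features fed into a layer of three nodes; stacking many
    # such layers is what makes a network "deep".
    layer_out = perceptron_layer([0.5, -1.0],
                                 [[0.1, 0.4], [-0.3, 0.8], [0.7, 0.2]],
                                 [0.0, 0.1, -0.2])
    ```

    Deep learning consists of composing many such layers and tuning the weights from data; in large language models the trained weights number in the billions.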
  • “The release of ChatGPT by OpenAI in November 2022 marked a pivotal moment in the introduction of a new generation of artificial intelligence (AI) to the broader public. The “GPT” in ChatGPT stands for “generative pre-trained transformer” and is a language model trained to generate realistic- and natural-sounding language. ChatGPT can generate coherent, grammatically correct text, which represents a significant step forward in AI development and has the potential to revolutionize a range of industries, including medicine and radiology.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology.  
    Ferres JML, Weeks WB, Chu LC, Rowe SP, Fishman EK.  
    Diagn Interv Imaging. 2023 104 (2023) 263−264 
  • “In the late 1970s, Steve Jobs found inspiration in a Scientific American paper that examined the efficiency of locomotion across various species. Although humans were not ranked as the most efficient species, a human on a bicycle was twice as efficient as the most efficient animal, the condor. Jobs used this analogy to highlight how humans can create tools that enhance their capabilities. Fast-forward to today and large language models powered by AI are akin to bicycles, a new addition to our toolbox that can boost our ability to learn, work, and create a positive impact on the planet. Nevertheless, it is essential to acknowledge that AI has its limitations and, at least for now, must be used in tandem with human ingenuity and decision-making.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology.  
    Ferres JML, Weeks WB, Chu LC, Rowe SP, Fishman EK.  
    Diagn Interv Imaging. 2023 104 (2023) 263−264 
  • “AI is not a new technology: it has been in development for decades and has already had a significant impact on various industries. For example, over the last decade, the use of AI in medical imaging has increased productivity and accuracy. It is important to understand that AI is not infallible. For the time being, even more advanced AI systems, such as ChatGPT, are reactive and do not plan or reason; they may make mistakes or generate inaccurate information, which can be mitigated but likely not completely avoided.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology.  
    Ferres JML, Weeks WB, Chu LC, Rowe SP, Fishman EK.  
    Diagn Interv Imaging. 2023 104 (2023) 263−264 
  •  “Although the primary emphasis of large language models has been on their chat capabilities, these models are transformative in the field of natural language processing. Much of medical knowledge is captured in unstructured text, ranging from doctors’ notes and diagnoses to research papers. GPT-type models can revolutionize our capacity to transform these data into valuable information, providing researchers with an enormous tool to discover new medical breakthroughs.”
    Beyond chatting: The opportunities and challenges of ChatGPT in medicine and radiology.  
    Ferres JML, Weeks WB, Chu LC, Rowe SP, Fishman EK.  
    Diagn Interv Imaging. 2023 104 (2023) 263−264 
  • “ChatGPT is able to generate coherent research articles, which on initial review may closely resemble authentic articles published by academic researchers. However, all of the articles we assessed were factually inaccurate and had fictitious references. It is worth noting, however, that the articles generated may appear authentic to an untrained reader.”
    A comparison of ChatGPT‑generated articles with human‑written articles
    Sisith Ariyaratne et al
    Skeletal Radiology 2023 (in press)
  • “Our study had several limitations. We used a relatively small sample size, analyzing only 5 articles generated by ChatGPT. We also used version 3.0, which may have certain limitations, including ability to generate accurate information. Further studies analyzing a larger number of articles with more advanced versions of the AI software would ultimately be needed to definitively assess its reliability in generating scientific articles and could be a topic for future research.”
    A comparison of ChatGPT‑generated articles with human‑written articles
    Sisith Ariyaratne et al
    Skeletal Radiology 2023 (in press)
  • “The use of ChatGPT and other related AI technology in nurse education is expected to continue to grow as technology advances and students and educators will become more comfortable with its use. However, it is important to note that while AI technology can enhance teaching and learning, it should not replace human interaction and support. Nurse educators and students should be mindful of the limitations of AI technology and ensure that it is used in conjunction with other teaching methods to provide holistic nurse education.”  
    Chatting or cheating? The impacts of ChatGPT and other artificial intelligence language models on nurse education  
    Edmond Pui Hang Choi et al.
    Nurse Education Today 125 (2023) 105796
  • “While there is certainly potential for ChatGPT to enhance the teaching and learning experience, there are also concerns about its impact on students' critical thinking and clinical reasoning skills. To understand the impacts of ChatGPT on nurse education, more empirical research is needed to investigate: (i) the impact of ChatGPT on student learning outcomes, such as critical thinking, clinical reasoning and knowledge acquisition; (ii) the role of ChatGPT in nurse educators' teaching and how it affects their workload, teaching practices and student engagement; and (iii) the ethical considerations and implications of using ChatGPT in nurse education.”  
    Chatting or cheating? The impacts of ChatGPT and other artificial intelligence language models on nurse education  
    Edmond Pui Hang Choi et al.
    Nurse Education Today 125 (2023) 105796
  •  “Nurse educators should teach students when it is appropriate to use ChatGPT, how to critically appraise the contents generated by it and how to avoid over-reliance on it (Mhlanga, 2023). Nurse educators should help students develop critical and independent thinking skills to evaluate the validity, appropriateness and relevance of the information provided by ChatGPT. For example, they can teach students to consider potential biases and cross-validate information using reputable sources. Nursing students should be encouraged to use multiple sources of information, such as textbooks, academic journals and clinical protocols and guidelines, in addition to ChatGPT as human-driven verification processes are indispensable (van Dis et al., 2023).”  
    Chatting or cheating? The impacts of ChatGPT and other artificial intelligence language models on nurse education  
    Edmond Pui Hang Choi et al.
    Nurse Education Today 125 (2023) 105796
  • “In this cross-sectional study, a chatbot generated quality and empathetic responses to patient questions posed in an online forum. Further exploration of this technology is warranted in clinical settings, such as using chatbot to draft responses that physicians could then edit. Randomized trials could assess further if using AI assistants might improve responses, lower clinician burnout, and improve patient outcomes.”
    Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum
    John W. Ayers et al.
    JAMA Intern Med. doi:10.1001/jamainternmed.2023.1838
  • Question Can an artificial intelligence chatbot assistant provide responses to patient questions that are of comparable quality and empathy to those written by physicians?
    Findings In this cross-sectional study of 195 randomly drawn patient questions from a social media forum, a team of licensed health care professionals compared physician and chatbot responses to patient questions asked publicly on a social media forum. The chatbot responses were preferred over physician responses and rated significantly higher for both quality and empathy.
    Meaning These results suggest that artificial intelligence assistants may be able to aid in drafting responses to patient questions.
    Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum
    John W. Ayers et al.
    JAMA Intern Med. doi:10.1001/jamainternmed.2023.1838
  • “ChatGPT represents a new generation of AI technologies driven by advances in large language models. ChatGPT reached 100 million users within 64 days of its November 30, 2022 release and is widely recognized for its ability to write near-human-quality text on a wide range of topics. The system was not developed to provide health care, and its ability to help address patient questions is unexplored. We tested its ability to respond with high-quality and empathetic answers to patients’ health care questions, by comparing the chatbot responses with physicians’ responses to questions posted on a public social media forum.”
    Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum
    John W. Ayers et al.
    JAMA Intern Med. doi:10.1001/jamainternmed.2023.1838
  • “While this cross-sectional study has demonstrated promising results in the use of AI assistants for patient questions, it is crucial to note that further research is necessary before any definitive conclusions can be made regarding their potential effect in clinical settings. Despite the limitations of this study and the frequent overhyping of new technologies, studying the addition of AI assistants to patient messaging workflows holds promise with the potential to improve both clinician and patient outcomes.”
    Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum
    John W. Ayers et al.
    JAMA Intern Med. doi:10.1001/jamainternmed.2023.1838

  • “We should be clear-eyed about the risks inherent to any new technology, especially one that carries existential implications. And yet, I am cautiously optimistic about a future of improved health care system efficiency, better patient outcomes, and reduced burnout; a future where AI enables us to get back to the reason why we decided to pursue medicine in the first place—to get up from the computer and back to the bedside.”
    Medicine in the Era of Artificial Intelligence: Hey Chatbot, Write Me an H&P
    Teva D. Brender
    JAMA Internal Medicine Published online April 28, 2023
  • “However, my excitement is tempered by a healthy dose of skepticism. For instance, consider the example of more analog technology. Despite their initial promise, the effect of medical scribes on health care quality, patient satisfaction, and physician productivity and burnout has been decidedly mixed. One might counter that, leveraging the power of big data, AI’s potential is limitless. Nevertheless, we should remain open to the eventuality that, like medical scribes, AI will similarly underdeliver, or that its implementation in health care might be slower and the initial use cases more circumscribed than the proponents hope.”
    Medicine in the Era of Artificial Intelligence: Hey Chatbot, Write Me an H&P
    Teva D. Brender
    JAMA Internal Medicine Published online April 28, 2023
  • “Finally, these programs are not sentient, they simply use massive amounts of text to predict one word after another, and their outputs may mix truth with patently false statements called hallucinations. As such, physicians will need to learn how to integrate these tools into clinical practice, defining clear boundaries between full, supervised, and proscribed autonomy. Just as I do not routinely count the little boxes when determining a heart rate, instead trusting the computer-generated electrocardiogram report, I always meticulously scrutinize the waveform before activating the catheterization laboratory for an ST-elevation myocardial infarction.”
    Medicine in the Era of Artificial Intelligence Hey Chatbot, Write Me an H&P
    Teva D. Brender
    JAMA Internal Medicine Published online April 28, 2023
  • “A generative pretrained transformer (GPT) is an AI tool that produces text resembling human writing, allowing users to interact with AI almost as if they are communicating with another person. The sudden rise in popularity of LLMs was driven largely by GPT-3, OpenAI’s third iteration, which was called the fastest growing app of all time and the most innovative LLM. People use GPT by entering prompts—text instructions in the form of questions or commands. Creating effective AI prompts is an art as much as a science, and the possibilities seem endless. One can use GPT like a search engine. However, GPT’s predictive algorithms can also answer questions that have never been posed.”
    AI-Generated Medical Advice-GPT and Beyond.  
    Haupt CE, Marks M.
    JAMA. 2023 Apr 25;329(16):1349-1350.
  • “For clinicians, GPT can potentially ease burnout by taking on repetitive tasks. It could provide clinical decision support and be incorporated into electronic medical record platforms like Epic. GPT might augment or replace frequently used resources like UpToDate. In theory, physicians could enter patient information into the software and ask for a differential diagnosis or preliminary treatment plan. However, current versions of GPT are not HIPAA compliant and could jeopardize patient privacy. Until professional grade versions with adequate safeguards are available, clinicians should avoid inputting protected health information.”
    AI-Generated Medical Advice-GPT and Beyond.  
    Haupt CE, Marks M.
    JAMA. 2023 Apr 25;329(16):1349-1350.
  • “With respect to AI-generated medical advice, as with other innovations, we suggest focusing on relevant social relationships and how the technology affects them. If clinicians use LLMs to aid decision-making, they function like other medical resources or tools. However, using AI to replace human judgment poses safety risks to patients and may expose clinicians to legal liability. Until its accuracy and reliability are proven, GPT should not replace clinician judgment. Although clinicians are not responsible for harms caused by consumer-facing LLMs, they should educate patients about the risks. They might also advocate for FTC regulation that protects patients from false or misleading AI-generated medical advice.”
    AI-Generated Medical Advice-GPT and Beyond.  
    Haupt CE, Marks M.
    JAMA. 2023 Apr 25;329(16):1349-1350.
  • “When reliable LLMs do surface, they may well be found among specialized systems rather than generalist systems like ChatGPT. The problem of nontransparent and indiscriminate information sourcing is tractable, and market innovations are already emerging as companies develop LLM products specifically for clinical settings. These models focus on narrower tasks than systems like ChatGPT, making validation easier to perform. Specialized systems can vet LLM outputs against source articles for hallucination, train on electronic health records, or integrate traditional elements of clinical decision support software. Some medical informatics researchers are more sanguine than others about the prospects for specialized systems to outperform generalist models. As evidence continues to emerge, medical informatics researchers will have an important role to play in helping physicians understand the current situation of the specialized systems.”
    ChatGPT and Physicians’ Malpractice Risk
    Michelle M. Mello, JD, PhD, MPhil; Neel Guha, MS
    JAMA Health Forum. 2023;4(5):e231938. 
  • “At their current stage, LLMs have a tendency to generate factually incorrect outputs (called hallucination). The potential to mislead physicians is magnified by the fact that most LLMs source information nontransparently. Typically, no list of references is provided by which a physician may evaluate the reliability of the information used to generate the output. When references are given, they are often insufficient or unsupportive of the generated output (if not entirely fabricated).”
    ChatGPT and Physicians’ Malpractice Risk
    Michelle M. Mello, JD, PhD, MPhil; Neel Guha, MS
    JAMA Health Forum. 2023;4(5):e231938. 
  • “ChatGPT has exploded into the national consciousness. The potential for large language models (LLMs) such as ChatGPT, Bard, and many others to support or replace humans in a range of areas is now clear—and medical decisions are no exception. This has sharpened a perennial medicolegal question: How can physicians incorporate promising new technologies into their practice without increasing liability risk? The answer lawyers often give is that physicians should use LLMs to augment, not replace, their professional judgment. Physicians might be forgiven for finding such advice unhelpful. No competent physician would blindly follow model output. But what exactly does it mean to augment clinical judgment in a legally defensible fashion?”
    ChatGPT and Physicians’ Malpractice Risk
    Michelle M. Mello, JD, PhD, MPhil; Neel Guha, MS
    JAMA Health Forum. 2023;4(5):e231938. 
  • “ChatGPT, for example, was trained on a great deal of data from the Internet. As the quality of responses depends on the quality of the training data, ChatGPT at times produces unsettling responses. It also “hallucinates” by providing answers that are not derivable from the training data. Importantly, GAI models can be iteratively “fine-tuned” with additional domain-specific training data. Applied appropriately, fine-tuning can produce specialized models that are more accurate and predictable and thus safer for specific use cases.”  
    Getting Generative AI Right
    Samuel Aronson, Ted W. Lieu, Benjamin M. Scirica
    NEJM April 2023 DOI: 10.1056/CAT.23.0063
  • “The Centers for Medicare & Medicaid Services (CMS) should establish reimbursement policies to incentivize the use of time-saving, validated GAI capabilities. The CMS Innovation Center (CMMI) could be used to expedite the testing of new GAI reimbursement models before these models are broadly rolled out. Many billing codes tie compensation to a certain number of minutes per case rather than to quality and efficiency. As a counterbalance, billing codes for use of GAI in specific circumstances could create a powerful incentive for private financing of GAI capabilities. CMS also could consider sponsoring a new GAI competition.”  
    Getting Generative AI Right
    Samuel Aronson, Ted W. Lieu, Benjamin M. Scirica
    NEJM April 2023 DOI: 10.1056/CAT.23.0063
  • Us: Will governments, clinicians, and technologists be supportive?  
    ChatGPT: There is increasing recognition among clinicians, technologists, and governments of the potential benefits of deploying generative AI to assist with administrative and documentation tasks in health care, and of the need to focus on more complex clinical decision support uses. However, there may also be concerns around ethical data use, patient privacy, and the need for ongoing education and training to ensure that this technology is used effectively and responsibly. As such, it will be important for stakeholders to work together to address these challenges and support the responsible deployment of generative AI in health care.  
    Getting Generative AI Right
    Samuel Aronson, Ted W. Lieu, Benjamin M. Scirica
    NEJM April 2023 DOI: 10.1056/CAT.23.0063
  • AI in Clinical Practice: Barriers
    Reimbursement challenges
    Physician over-reliance on AI
    Potential issues with bias in AI algorithms (usually unintentional)
    Liability issues
    Patient comfort with AI assisted medicine
    Black box nature of AI
  • “A chatbot consists of two main components: a general-purpose AI system and a chat interface. This article considers specifically an AI system called GPT-4 (Generative Pretrained Transformer 4) with a chat interface; this system is widely available and in active development by OpenAI, an AI research and deployment company. To use a chatbot, one starts a “session” by entering a query — usually referred to as a “prompt” — in plain natural language. Typically, but not always, the user is a human being. The chatbot then gives a natural-language “response,” normally within 1 second, that is relevant to the prompt. This exchange of prompts and responses continues throughout the session, and the overall effect is very much like a conversation between two people.”
    Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
    Peter Lee, Sebastien Bubeck, Joseph Petro
    n engl j med 388;13 nejm.org March 30, 2023
  • “A false response by GPT-4 is sometimes referred to as a “hallucination,” and such errors can be particularly dangerous in medical scenarios because the errors or falsehoods can be subtle and are often stated by the chatbot in such a convincing manner that the person making the query may be convinced of its veracity. It is thus important to check or verify the output of GPT-4.”
    Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
    Peter Lee, Sebastien Bubeck, Joseph Petro
    n engl j med 388;13 nejm.org March 30, 2023
  • “GPT-4 was not programmed for a specific “assigned task” such as reading images or analyzing medical notes. Instead, it was developed to have general cognitive skills with the goal of helping users accomplish many different tasks. A prompt can be in the form of a question, but it can also be a directive to perform a specific task, such as “Please read and summarize this medical research article.” Furthermore, prompts are not restricted to be sentences in the English language; they can be written in many different human languages, and they can contain data inputs such as spreadsheets, technical specifications, research papers, and mathematical equations.”
    Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
    Peter Lee, Sebastien Bubeck, Joseph Petro
    N Engl J Med 2023;388(13). March 30, 2023
  • “Even though GPT-4 was trained only on openly available information on the Internet, when it is given a battery of test questions from the USMLE, it answers correctly more than 90% of the time. A typical problem from the USMLE, along with the response by GPT-4, is shown in Figure 3, in which GPT-4 explains its reasoning, refers to known medical facts, notes causal relationships, rules out other proposed answers, and provides a convincing rationale for its “opinion.””
    Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
    Peter Lee, Sebastien Bubeck, Joseph Petro
    N Engl J Med 2023;388(13). March 30, 2023
  • “This knowledge of medicine makes GPT-4 potentially useful not only in clinical settings but also in research. GPT-4 can read medical research material and engage in informed discussion about it, such as briefly summarizing the content, providing technical analysis, identifying relevant prior work, assessing the conclusions, and asking possible follow-up research questions.”
    Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
    Peter Lee, Sebastien Bubeck, Joseph Petro
    N Engl J Med 2023;388(13). March 30, 2023
  • “Perhaps the most important point is that GPT-4 is not an end in and of itself. It is the opening of a door to new possibilities as well as new risks. We speculate that GPT-4 will soon be followed by even more powerful and capable AI systems — a series of increasingly powerful and generally intelligent machines. These machines are tools, and like all tools, they can be used for good but have the potential to cause harm. If used carefully and with an appropriate degree of caution, these evolving tools have the potential to help health care providers give the best care possible.”
    Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
    Peter Lee, Sebastien Bubeck, Joseph Petro
    N Engl J Med 2023;388(13). March 30, 2023
  • “It is important to understand that this is a fast-moving field, so to some extent, what we publish may have the resolution of a snapshot of the landscape taken from a bullet train. Specifically, things happening in close temporal proximity to publication may be blurred because they are changing quickly, but the distant background will be in reasonably good focus.”
    Artificial Intelligence and Machine Learning in Clinical Medicine, 2023  
    Charlotte J. Haug, Jeffrey M. Drazen  
    N Engl J Med 2023;388:1201-8.  
  • “A chatbot is a computer program that uses AI and natural-language processing to understand questions and automate responses to them, simulating human conversation. A very early medical chatbot, ELIZA, was developed between 1964 and 1966 by Joseph Weizenbaum at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology.”
    Artificial Intelligence and Machine Learning in Clinical Medicine, 2023  
    Charlotte J. Haug, Jeffrey M. Drazen  
    N Engl J Med 2023;388:1201-8. 
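ELIZA, mentioned above, worked not by understanding language but by matching user input against a list of hand-written patterns and filling the match into a response template. A minimal sketch of that pattern-matching idea (the rules here are invented for illustration, not Weizenbaum's original script):

```python
import re

# ELIZA-style rules: (pattern, response template) pairs, tried in order.
# These example rules are illustrative, not the historical ELIZA script.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
]

def respond(utterance):
    """Return the first rule's template filled with the matched text,
    or a generic fallback when no rule matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1))
    return "Please tell me more."

print(respond("I feel tired"))  # Why do you feel tired?
```

The contrast with modern LLM chatbots is the point: ELIZA's behavior is fully specified by its rule list, whereas a model like GPT-4 generates responses from learned statistical patterns rather than explicit templates.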
  • “The application of greatest potential and concern is the use of chatbots to make diagnoses or recommend treatment. A user without clinical experience could have trouble differentiating fact from fiction. Both these issues are addressed in the article by Lee and colleagues, who point out the strengths and weaknesses of using chatbots in medicine. Since the authors have created one such entity, bias is likely. Nevertheless, we think that chatbots will become important tools in the practice of medicine. Like any good tool, they can help us do our job better, but if not used properly, they have the potential to do damage. Since the tools are new and hard to test with the use of the traditional methods noted above, the medical community will be learning how to use them, but learn we must. There is no question that the chatbots will also learn from their users.”
    Artificial Intelligence and Machine Learning in Clinical Medicine, 2023  
    Charlotte J. Haug, Jeffrey M. Drazen
    N Engl J Med 2023;388:1201-8. 
  • “We firmly believe that the introduction of AI and machine learning in medicine has helped health professionals improve the quality of care that they can deliver and has the promise to improve it even more in the near future and beyond. Just as computer acquisition of radiographic images did away with the x-ray file room and lost images, AI and machine learning can transform medicine. Health professionals will figure out how to work with AI and machine learning as we grow along with the technology. AI and machine learning will not put health professionals out of business; rather, they will make it possible for health professionals to do their jobs better and leave time for the human–human interactions that make medicine the rewarding profession we all value.”
    Artificial Intelligence and Machine Learning in Clinical Medicine, 2023  
    Charlotte J. Haug, Jeffrey M. Drazen  
    N Engl J Med 2023;388:1201-8. 
  • “Artificial intelligence (AI) technologies to help authors improve the preparation and quality of their manuscripts and published articles are rapidly increasing in number and sophistication. These include tools to assist with writing, grammar, language, references, statistical analysis, and reporting standards. Editors and publishers also use AI-assisted tools for myriad purposes, including to screen submissions for problems (eg, plagiarism, image manipulation, ethical issues), triage submissions, validate references, edit, and code content for publication in different media and to facilitate postpublication search and discoverability.”
    Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge.  
    Flanagin A, Bibbins-Domingo K, Berkwits M, Christiansen SL.
    JAMA. Published online January 31, 2023. doi:10.1001/jama.2023.1344
  • “In November 2022, OpenAI released a new open source, natural language processing tool called ChatGPT. ChatGPT is an evolution of a chatbot that is designed to simulate human conversation in response to prompts or questions (GPT stands for “generative pretrained transformer”). The release has prompted immediate excitement about its many potential uses but also trepidation about potential misuse, such as concerns about using the language model to cheat on homework assignments, write student essays, and take examinations, including medical licensing examinations. In January 2023, Nature reported on 2 preprints and 2 articles published in the science and health fields that included ChatGPT as a bylined author. Each of these includes an affiliation for ChatGPT, and 1 of the articles includes an email address for the nonhuman “author.” According to Nature, that article’s inclusion of ChatGPT in the author byline was an “error that will soon be corrected.” However, these articles and their nonhuman “authors” have already been indexed in PubMed and Google Scholar.”
    Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge.  
    Flanagin A, Bibbins-Domingo K, Berkwits M, Christiansen SL.
    JAMA. Published online January 31, 2023. doi:10.1001/jama.2023.1344
  • “Nonhuman artificial intelligence, language models, machine learning, or similar technologies do not qualify for authorship. If these models or tools are used to create content or assist with writing or manuscript preparation, authors must take responsibility for the integrity of the content generated by these tools. Authors should report the use of artificial intelligence, language models, machine learning, or similar technologies to create content or assist with writing or editing of manuscripts in the Acknowledgment section or the Methods section if this is part of formal research design or methods. This should include a description of the content that was created or edited and the name of the language model or tool, version and extension numbers, and manufacturer. (Note: this does not include basic tools for checking grammar, spelling, references, etc.)”
    Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge.  
    Flanagin A, Bibbins-Domingo K, Berkwits M, Christiansen SL.
    JAMA. Published online January 31, 2023. doi:10.1001/jama.2023.1344
  • “Transformative, disruptive technologies, like AI language models, create promise and opportunities as well as risks and threats for all involved in the scientific enterprise. Calls for journals to implement screening for AI-generated content will likely escalate, especially for journals that have been targets of paper mills and other unscrupulous or fraudulent practices. But with large investments in further development, AI tools may be capable of evading any such screens. Regardless, AI technologies have existed for some time, will be further and faster developed, and will continue to be used in all stages of research and the dissemination of information, hopefully with innovative advances that offset any perils. In this era of pervasive misinformation and mistrust, responsible use of AI language models and transparent reporting of how these tools are used in the creation of information and publication are vital to promote and protect the credibility and integrity of medical research and trust in medical knowledge.”
    Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge.  
    Flanagin A, Bibbins-Domingo K, Berkwits M, Christiansen SL.
    JAMA. Published online January 31, 2023. doi:10.1001/jama.2023.1344
  • “This exploratory study found that a popular online AI model provided largely appropriate responses to simple CVD prevention questions as evaluated by preventive cardiology clinicians. Findings suggest the potential of interactive AI to assist clinical workflows by augmenting patient education and patient-clinician communication around common CVD prevention queries. For example, such an application may provide conversational responses to simple queries on informational platforms or create automated draft responses to patient electronic messages for clinicians. Whether these approaches can improve readability should be explored, because prior work has indicated low readability of certain online patient educational materials for CVD prevention.”
    Appropriateness of Cardiovascular Disease Prevention Recommendations Obtained From a Popular Online Chat-Based Artificial Intelligence Model.  
    Sarraju A, Bruemmer D, Van Iterson E, Cho L, Rodriguez F, Laffin L.  
    JAMA. Published online February 03, 2023. doi:10.1001/jama.2023.1044
  • “AI model responses to 21 of 25 questions (84%) were graded as appropriate in both contexts (Table). Four responses (16%) were graded as inappropriate in both contexts. For 3 of the 4 sets of responses, all 3 responses had inappropriate information; for 1 set, 1 of 3 responses was inappropriate. For example, the AI model responded to questions about exercise by firmly recommending both cardiovascular activity and lifting weights, which may be incorrect and potentially harmful for certain patients. Responses about interpreting a low-density lipoprotein cholesterol level of 200 mg/dL lacked relevant details, including familial hypercholesterolemia and genetic considerations. Responses about inclisiran suggested that it is commercially unavailable. No responses were graded as unreliable.”
    Appropriateness of Cardiovascular Disease Prevention Recommendations Obtained From a Popular Online Chat-Based Artificial Intelligence Model.  
    Sarraju A, Bruemmer D, Van Iterson E, Cho L, Rodriguez F, Laffin L.  
    JAMA. Published online February 03, 2023. doi:10.1001/jama.2023.1044

Copyright © 2024 The Johns Hopkins University, The Johns Hopkins Hospital, and The Johns Hopkins Health System Corporation. All rights reserved.