The limits of large language models in clinical practice

Edward G. Rogoff and Alena Ivashenka, PhD
Tech
May 2, 2026

Artificial intelligence is no longer a distant concept in modern medicine. It is already entering clinical workflows through tools that draft patient notes, summarize charts, generate patient education materials, and assist with decision-making. At the center of this shift are large language models such as ChatGPT, Med-PaLM, and other health care-adapted systems. For clinicians, the most important question is not whether these tools are impressive. It is whether clinicians understand what large language models actually do, where these models help, and where they can mislead.

That question matters because artificial intelligence is arriving at a particularly difficult moment in medicine. Patient demand continues to rise. Workforce shortages persist. Care is becoming more complex. Administrative burden keeps expanding. Burnout is no longer an abstract concern; it is part of the daily reality of clinical practice. In that environment, artificial intelligence is not entering health care as a novelty. It is being introduced as a possible response to a system already under significant strain. The real issue is not whether artificial intelligence will replace physicians. It will not. The issue is whether it can meaningfully reduce some of the pressure that is making modern medicine harder to sustain.

How large language models work

Large language models are often described as intelligent, but that description can be misleading. They are not clinical reasoning engines. They are highly advanced language prediction systems designed to generate the next most likely word or phrase based on the text that comes before it. Large language models are trained on enormous datasets and become very good at recognizing patterns, relationships, and structure in language.
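The core idea of next-word prediction can be made concrete with a deliberately tiny sketch. This is a toy bigram counter, not how modern transformer models are actually built; the point is only that prediction comes from statistical patterns in the training text, not from any model of disease:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words follow it in the training text."""
    words = corpus.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical training snippets (illustration only, not clinical guidance)
corpus = (
    "chest pain radiating to the left arm "
    "chest pain with shortness of breath "
    "chest pain relieved by rest"
)
model = train_bigram(corpus)
print(predict_next(model, "chest"))  # prints "pain": the only word ever seen after "chest"
print(predict_next(model, "syncope"))  # prints None: the word never appeared in training
```

A model like this will fluently continue any phrase it has seen before and fail silently on anything it has not, which is a miniature version of why fluency in a large language model is not evidence of understanding.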

This distinction is critical. Large language models can produce responses that sound fluent, polished, and clinically plausible. But fluency is not the same as understanding. These systems do not know pathophysiology. They do not think through uncertainty the way a clinician does. They do not build a differential diagnosis from first principles or apply judgment in the human sense. Their strength lies in pattern recognition at a vast scale, not true reasoning.

Large language model development typically happens in stages. First is pretraining on vast amounts of text, which may include medical literature, guidelines, educational resources, and other written material. This gives the model broad familiarity with language and domain-specific terminology. Next comes fine-tuning on narrower health care datasets, such as clinical notes, radiology reports, discharge summaries, or patient educational content. Some systems are further shaped through reinforcement learning from human feedback, in which clinicians or reviewers evaluate outputs and steer the model toward safer or more useful responses.

This process of training improves performance, but it does not eliminate risk. These models inherit both the strengths and the weaknesses of the data used to train them. If the source material is inconsistent, biased, outdated, or inaccurate, the model will reflect those problems. And because the practice of medicine changes quickly, even a well-performing model can become out-of-date unless it is regularly updated or paired with tools that retrieve and provide current information.
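The pairing with retrieval tools mentioned above is commonly called retrieval-augmented generation: current reference material is fetched at question time and placed in the prompt, so the model is not limited to what it memorized during training. A minimal sketch follows, using hypothetical guideline snippets and simple keyword overlap in place of the embedding-based similarity search real systems use:

```python
def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query (a crude stand-in
    for the vector similarity search used in production systems)."""
    q = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, documents):
    """Prepend the retrieved, current material so the model's answer is
    grounded in it rather than in possibly stale training data."""
    context = "\n".join(retrieve(query, documents))
    return f"Use only the current guidance below.\n\n{context}\n\nQuestion: {query}"

# Hypothetical placeholder snippets (illustration only, not real guidelines)
guidelines = [
    "example guideline: start a beta blocker after a stable event",
    "example guideline: annual retinopathy screening in type 2 diabetes",
]
print(build_prompt("When to start a beta blocker?", guidelines))
```

The design point is that freshness lives in the retrieved documents, not in the model weights: updating the guideline store updates the answers without retraining.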

Data integrity and limitations in large language model training

Anyone who works in medicine understands how messy clinical documentation can be. Notes are filled with abbreviations, shorthand, copied-forward text, inconsistent formatting, outdated details, and fragmented narratives. Those same flaws become a problem when such notes are absorbed into training datasets. The model learns not only medical language but also the mistakes embedded within it.

The issue extends beyond messy notes. Training data may also include outdated recommendations, incorrect diagnoses, low-quality educational sources, demographically narrow studies, or content that has not been rigorously validated. Some datasets underrepresent important populations, including racial and ethnic minorities, pediatric and geriatric patients, low-income communities, and people with rare diseases. As a result, model performance may not be equally reliable across patient groups. In some cases, these tools risk reinforcing the very inequities medicine is already trying to address.

Even the technical side of data management can introduce problems. Duplicated encounters, coding errors, mislabeled images, and synthetic examples that do not reflect real-world care can all distort the model’s output. When that happens, the result may be an answer that appears coherent but is fundamentally wrong.

Clinical limitations of large language models

This leads to one of the best-known limitations of large language models: hallucination. These systems can generate false information with complete confidence. They may invent citations, fabricate guideline recommendations, misstate pathophysiology, or produce inaccurate clinical summaries that sound entirely reasonable. In a busy clinical environment, this is a serious risk. The smoother the language, the easier it is to miss the mistake.

More fundamentally, large language models do not exercise clinical judgment. They do not perform true Bayesian reasoning. They do not recognize when a patient presentation is evolving away from the expected pattern. They do not cope with uncertainty, challenge assumptions, or change course when new facts are introduced. They cannot synthesize medical facts with ethics, family dynamics, psychosocial context, and bedside nuance the way experienced clinicians do every day.

Large language models also lack real-time awareness unless they are tightly integrated with the clinical environment. Without access to current vital signs, laboratory results, imaging, medication changes, nursing observations, and social context, they are generating language based on incomplete information. Even when they are integrated into health systems, large language models may still misinterpret conflicting or partial data.

Another major limitation is explainability. Large language models can produce an answer, but they cannot provide a transparent, auditable account of how they arrived at it in a way that satisfies the standards of peer review, legal scrutiny, or formal clinical justification. In medicine, where documentation and accountability matter, that is a serious shortcoming.

The regulatory environment only adds to the uncertainty. The legal and policy framework for clinical artificial intelligence is still evolving. Questions remain about liability, documentation of artificial intelligence-assisted decisions, privacy protections, Food and Drug Administration oversight, and the degree to which clinicians can safely rely on machine-generated recommendations. At present, the clearest principle is that clinician oversight must remain central. Artificial intelligence may assist, but it cannot be the final decision-maker.

Where humans remain central

Medicine is not just an exercise in information retrieval. Clinical care requires interpretation, prioritization, communication, and accountability. Physicians and other clinicians integrate subtle findings across multiple domains, weigh competing possibilities, interpret uncertainty, and adapt decisions to the patient in front of them. They account for psychosocial realities, family concerns, values, culture, and goals of care. They make judgments that go beyond the words in a chart.

These strengths remain deeply human. Medicine is relational. Trust, empathy, shared decision-making, and therapeutic presence are not secondary to care; they are part of care itself. A patient’s willingness to disclose symptoms, follow recommendations, or navigate a serious illness often depends on the quality of that relationship. No language model can replace that. The growing presence of technology in medicine makes the human side of care even more important.

The emerging clinical value of large language models

At the same time, dismissing artificial intelligence and large language models would be a mistake. While large language models are primarily language tools, the broader ecosystem of clinical artificial intelligence is already demonstrating meaningful value in practice, especially in areas marked by information overload and time pressure. In radiology, emergency medicine, and hospital care, artificial intelligence systems are increasingly used to flag acute findings such as stroke, pulmonary embolism, and intracranial hemorrhage, which helps clinicians prioritize critical cases more quickly. In pathology and oncology, artificial intelligence can identify subtle patterns in tissue and imaging data, improving consistency and supporting faster workflows. These systems do not replace expertise. They help clinicians apply expertise more efficiently.

Artificial intelligence is also beginning to reshape the front end of care. Symptom assessment tools and virtual triage platforms may help patients navigate the system more effectively, potentially reducing unnecessary visits and improving alignment between patient needs and the care setting they choose. Predictive analytics are also creating opportunities for earlier identification of risk and more personalized treatment strategies by integrating clinical, genomic, and longitudinal data. Taken together, these developments suggest that artificial intelligence’s greatest contribution may not be replacing decision-making but improving how quickly and effectively clinicians can process complex information and act on it.

For now, the safest and most appropriate role for large language models in clinical practice is structured, lower-risk support tasks. They can be useful for drafting histories and physicals, discharge summaries, referral letters, and patient instructions. They can help summarize long records, organize information, support coding when verified, and assist with literature synthesis and educational material. What they should not do is function autonomously in high-stakes clinical decisions. They should not independently diagnose new conditions, recommend treatment changes, interpret imaging or electrocardiograms without oversight, make triage decisions, or manage unstable patients. These are not just technical tasks. They require human judgment.

Safe use of these tools requires discipline. Clinicians should review and edit every output. They should use only institution-approved, privacy-compliant systems. Artificial intelligence-generated language should be treated as a draft rather than a conclusion. Most importantly, clinicians should document and rely on their own reasoning rather than simply inheriting the model’s phrasing. Large language models do offer real value in medicine, particularly for documentation, summarization, and information synthesis. But they are not substitutes for clinical reasoning. They do not understand disease, do not think causally, and do not carry responsibility.

Still, their arrival is not happening in a vacuum. They are entering a health care system weighed down by administrative overload, cognitive burden, and workforce strain. If implemented thoughtfully, artificial intelligence may help reduce some of that pressure. It may improve efficiency, support faster recognition of important information, and return time to clinicians. All of this matters, because the most valuable elements of medical care are the principles that technology cannot replace: thinking carefully, connecting with patients, exercising judgment, and delivering humane, high-quality care. The future of artificial intelligence in medicine should not be framed as physicians versus technology. It should be framed as physicians being supported by technology that can genuinely help. At its best, artificial intelligence does not diminish the clinician’s role. It supports and protects it.

Edward G. Rogoff is a professor of entrepreneurship and a patient advocate. Alena Ivashenka is a biotechnology and life sciences investment expert.

Tagged as: Health IT
