KevinMD

Why the safest medical AI knows when not to answer

Timothy Lesaca, MD
Health Technology
June 17, 2026

When a recent study in Nature Medicine compared general-purpose AI models with specialized clinical tools, the headline was exactly the kind the technology world likes to see: The generalists came out on top.

The reactions were predictable. Some observers treated the study as proof that specialized clinical software is no longer needed, arguing that bigger and more powerful AI models will always win. Others dismissed the study, pointing out that AI changes so quickly that any published comparison is already out of date by the time it appears. But I believe both reactions miss the deeper issue. The problem is that many current AI evaluations ignore a central part of clinical medicine: knowing when not to answer.

In medicine, staying silent does not always mean you do not know. Pausing before answering is not a weakness. Sometimes, the most responsible thing a clinician can do is hold back. Every practicing physician understands this instinctively. We are trained not only to gather medical facts but also to recognize the limits of the information before us. We look for what is missing.

That is why one of the most interesting measures in the Nature Medicine study was not simply accuracy, but refusal. UpToDate Expert AI reportedly declined to answer 19 percent of real clinical questions, far more often than the general-purpose models. In a standard AI benchmark, this looks like failure. The model did not satisfy the prompt. It left a blank. It seemed less useful. But in real clinical care, saying, “I cannot answer that safely from the information provided,” may be exactly what a patient needs.

In software, refusal is often treated as a problem to solve. In medicine, refusal can be a safety mechanism. A chatbot that always produces a confident answer may appear more intelligent than a clinical tool that pauses, narrows the question, or declines to respond. But patients are not protected by smooth language.

This is where current AI evaluations fall short. They reward answers that sound smooth, complete, and confident. They ask whether the AI answered the question, but they too rarely ask whether the question was safe to answer in the first place. On a spreadsheet, a formatting mistake and a dangerous pediatric drug dose may both appear as “errors.” In real life, they are not remotely equivalent. A minor mistake in a low-stakes explanation of disease physiology is one thing. Inventing a contraindication, overlooking pregnancy, or recommending a medication without adjusting for severe renal impairment can be catastrophic. If our evaluation systems score all errors as if they carry the same clinical consequence, we will incentivize developers to build the wrong tools.

This is not a criticism of general-purpose AI models. Like many physicians, I use AI regularly to help with documentation, summarize long notes, and organize messy clinical narratives. For administrative and cognitive support, these tools can be extremely useful. But clinical decision support is different. It is not just about finding the right fact or producing the most polished explanation. It is about managing risk when information is incomplete.

That is where I think the board-exam mindset of AI benchmarking breaks down. Standardized medical questions are tidy. They usually include the necessary facts and are designed to lead toward a correct answer. Real patients are rarely so easy. A model that performs well when all relevant facts are already present may not be the safest when the most important fact is missing.

Sometimes an AI tool should summarize. Sometimes it should retrieve a guideline. Sometimes it should identify a missing variable, such as creatinine clearance, gestational age, QT interval, culture data, or medication history. And sometimes it should say: This situation is too risky for an automated answer, and specialist input or urgent clinical evaluation is needed. That may frustrate a benchmark. But it often mirrors safe clinical practice.

Medical AI evaluations need to become abstention-aware. Refusal should not automatically be scored as failure. It should be examined. Was the prompt underspecified? Was the clinical risk high? Did the tool identify the missing information? Did it redirect safely? Did it protect the patient from a plausible but unsafe answer?
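One way to make those review questions concrete is to record them for each refusal and label the refusal instead of marking it as a failure. The sketch below is only an illustration: the `Refusal` fields, the `review_refusal` function, and the three labels are hypothetical names chosen here, not part of the Nature Medicine study or any existing benchmark.

```python
from dataclasses import dataclass


@dataclass
class Refusal:
    """A reviewer's answers to the questions above (hypothetical schema)."""
    prompt_underspecified: bool      # Was key clinical information missing?
    high_clinical_risk: bool         # Could a wrong answer cause serious harm?
    named_missing_information: bool  # Did the tool say what it needed?
    redirected_safely: bool          # Did it point to a specialist or urgent care?


def review_refusal(r: Refusal) -> str:
    """Label a refusal rather than scoring it as an automatic failure."""
    justified = r.prompt_underspecified or r.high_clinical_risk
    helpful = r.named_missing_information or r.redirected_safely
    if justified and helpful:
        return "appropriate abstention"
    if justified:
        return "justified but unhelpful"
    return "unnecessary refusal"


# A dosing question with no renal function given, where the tool asks for it.
example = Refusal(
    prompt_underspecified=True,
    high_clinical_risk=True,
    named_missing_information=True,
    redirected_safely=False,
)
print(review_refusal(example))
```

The labels matter more than the exact rules: an evaluation that separates these three cases can credit a tool for pausing well and still penalize it for refusing without cause.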

We also need consequence-weighted scoring. A harmless omission should not be treated the same as a potentially lethal hallucination. A model that refuses appropriately in high-risk situations may be safer than one that answers every question with confidence. If the market keeps rewarding models that never pause, developers will build systems optimized for relentless answer production. Our clinics will end up with tools that sound decisive precisely when medicine requires humility.
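A small sketch can show how consequence-weighted scoring changes the ranking of two tools. Everything in it is assumed for illustration: the severity categories, the weights in `SEVERITY_WEIGHT`, the `ABSTAIN_COST`, and both sets of outcomes are invented, and real weights would need clinical consensus.

```python
# Hypothetical penalty per outcome; higher means more clinical harm.
SEVERITY_WEIGHT = {"none": 0.0, "minor": 1.0, "serious": 10.0, "lethal": 100.0}
ABSTAIN_COST = 0.5  # assumed small cost for declining to answer


def flat_failures(outcomes):
    """Conventional scoring: every error or refusal counts as one failure."""
    return sum(1 for o in outcomes if o != "none")


def weighted_penalty(outcomes):
    """Consequence-weighted scoring: penalty scales with potential harm."""
    total = 0.0
    for o in outcomes:
        total += ABSTAIN_COST if o == "abstain" else SEVERITY_WEIGHT[o]
    return total


# Invented results on ten questions, not data from any study.
always_answers = ["none"] * 9 + ["lethal"]    # one dangerous hallucination
pauses = ["none"] * 8 + ["abstain"] * 2       # two refusals, no errors

print(flat_failures(always_answers), flat_failures(pauses))        # 1 2
print(weighted_penalty(always_answers), weighted_penalty(pauses))  # 100.0 1.0
```

Under flat counting the tool that never pauses looks better, with one failure against two. Once harm is weighted, the order reverses, which is the incentive problem described above.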

Restraint is part of good clinical care. Not prescribing, not reassuring, not discharging, and not answering prematurely are often among the most important decisions a physician makes. The future of medical AI should be shaped by whether our tools learn the same lesson we teach every medical trainee: Answer when you can, ask when you must, and stop when answering would be unsafe.

The safest medical AI may not be the one that always has an answer. It may be the one that knows when not to.

Timothy Lesaca is a psychiatrist in private practice at New Directions Mental Health in Pittsburgh, Pennsylvania, with more than forty years of experience treating children, adolescents, and adults across outpatient, inpatient, and community mental health settings. He has published in peer-reviewed and professional venues including the Patient Experience Journal, Psychiatric Times, the Allegheny County Medical Society Bulletin, and other clinical journals, with work addressing topics such as open-access scheduling, Landau-Kleffner syndrome, physician suicide, and the dynamics of contemporary medical practice. His recent writing examines issues of identity, ethical complexity, and patient–clinician relationships in modern health care. Additional information about his clinical practice and professional work is available on his website, timothylesacamd.com. He also maintains a ResearchGate profile, where further publications and details may be found.


Tagged as: Health IT and AI in Medicine

Founded in 2004 by Kevin Pho, MD, KevinMD.com is the web’s leading platform where physicians, advanced practitioners, nurses, medical students, and patients share their insight and tell their stories.

Copyright © 2026 KevinMD.com | Powered by Astra WordPress Theme
