Where generative AI is working for doctors—and where it's falling short

While generative AI has demonstrated potential in assisting doctors with tasks like note-taking and scan analysis, the technology is still in its early stages and faces barriers to widespread implementation in healthcare settings:
- AI-powered ambient note-taking software has improved patient and provider satisfaction in pilot programs by allowing doctors to engage more directly with patients, but some physicians find the technology cumbersome or struggle to verbalize the right level of detail for it to capture.
- AI models have shown promise in analyzing medical scans and documenting abnormalities, potentially catching issues early and saving lives, but integration into clinical workflows is still limited.
Challenges for large language models (LLMs) in healthcare: LLMs designed for general use often struggle to provide reliable, evidence-based answers to medical questions, while healthcare-specific models face limitations due to the lack of data on novel treatments and underrepresented patient populations:
- General-purpose LLMs like ChatGPT frequently “hallucinate” citations, providing irrelevant or inaccurate information up to 80% of the time when asked medical questions.
- Even healthcare-focused models like ChatRWD and OpenEvidence, used in combination, can reliably answer only 70% of physician questions, as they lack sufficient data on cutting-edge treatments and patients with complex comorbidities.
Cautious optimism for AI’s impact: Despite the challenges, many physicians and health system executives remain hopeful about the potential of generative AI to improve efficiency, patient care, and financial performance:
- Jeffrey Sturman, chief digital information officer at Memorial Healthcare System, believes AI can significantly reduce physician burnout by automating time-consuming documentation tasks, with Memorial already seeing promising results from Microsoft and Nuance’s DAX ambient note-taking tool.
- Dr. Saurabh Gombar, co-founder of Atropos Health, expects healthcare-specific LLMs to reach 90-95% reliability in the next 1-2 years, at which point the remaining unanswerable questions will likely pertain to extremely novel treatments or unique patient cases.
Adjusting workflows is key: To maximize the value of generative AI tools, experts emphasize the need for physicians and health systems to thoughtfully integrate the technology into existing processes:
- Dr. Thomas Maddox of BJC HealthCare notes that simply layering new AI tools on top of current workflows is insufficient; doctors must adjust their own practices and adapt to the capabilities and limitations of the technology to reap the full benefits.
- A careful, measured approach to AI adoption, including thorough vetting of vendors and tools, will be critical for health systems navigating the hype and noise surrounding the technology.
Analyzing the road ahead: As the capabilities of generative AI continue to evolve rapidly, its role in medicine is likely to expand, but not without growing pains. Striking the right balance between human expertise and artificial intelligence will be an ongoing challenge, requiring close collaboration between clinicians, technology developers, and health system leaders. While the transformative potential of AI in healthcare is clear, realizing its full impact will depend on a thoughtful, evidence-based approach to implementation that prioritizes patient safety, physician buy-in, and seamless integration into clinical workflows.