There are many new AI tools available for healthcare professionals, from transcription and imaging to diagnostics. Many tout accuracy rates above 90%, but most are tested only in isolation.
Those tools become less reliable when used together, an analysis by Korean AI scientist Kwansub Yun suggests.
Yun and health consultant Claire Hast ran an example scenario in which a patient had a physical transcribed by AI, received a mammogram using AI-assisted imaging and got a diagnosis with help from an AI tool.
While each tool individually had a reported accuracy rating of more than 85%, the system as a whole had a reliability score of just 74%.
Yun used a systems-level analysis to estimate the overall workflow reliability of the three tools used together.
- Drawing on publicly available accuracy data for an imaging tool (90%), a documentation tool (85%) and a diagnostics tool (97%), Yun arrived at a reliability score of 74%; the sketch after this list shows the multiplication.
- “The formula is a standard reliability engineering heuristic — the same structural logic used to estimate system reliability in aerospace and defense,” says Yun.
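The arithmetic behind the 74% figure is the standard series-reliability product: a workflow succeeds only if every stage succeeds, so the stage probabilities multiply. A minimal sketch of that heuristic using the reported figures above (the function name and structure are illustrative, not Yun's actual code):

```python
def series_reliability(stage_accuracies):
    """Series-reliability heuristic: a workflow succeeds only if
    every stage succeeds, so stage probabilities multiply."""
    result = 1.0
    for accuracy in stage_accuracies:
        result *= accuracy
    return result

# Reported standalone accuracies from the scenario:
# imaging (90%), documentation (85%), diagnostics (97%)
stages = [0.90, 0.85, 0.97]
print(f"Workflow reliability: {series_reliability(stages):.1%}")
# -> Workflow reliability: 74.2%
```

Note the built-in assumption: stage errors are treated as independent. Correlated failures, such as a transcription error that steers the diagnostic tool, would change the figure.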
If erroneous data from one AI tool is fed into another, the secondary tool has no way to flag the unreliable inputs, says Yun.
- “The result looks authoritative, but the chain that produced it was never measured end to end.”
That’s particularly troubling given that the standard regulatory procedure for evaluating these tools tests each model’s performance in isolation, Hast and Yun say.
- “What no one is currently required to measure is the reliability of the full workflow that the model sits inside,” Yun says.
Human doctors are also typically evaluated as individuals, not as part of a broader system; there’s no data on how much reliability slips as patients move between providers.
- “If you chain together the probabilities of accuracy for any human making many sequential decisions, you realize how likely you are to get errors,” says Mark Sendak, CEO of AI infrastructure and evaluation startup Vega Health. (A short illustration of that compounding follows these quotes.)
- “My fear is that we’re going to hold AI to a standard of perfection that is clearly not the standard that we hold the existing medical system to,” says UC San Francisco Department of Medicine chair Robert Wachter.
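Sendak’s chained-probabilities point follows the same multiplication. A minimal sketch, assuming a hypothetical clinician who is 95% accurate per decision and makes ten sequential decisions on one patient’s pathway (both figures are invented for illustration, not from the article):

```python
# Hypothetical: 95% accuracy per decision, 10 sequential decisions.
per_decision_accuracy = 0.95
decisions = 10
chain_reliability = per_decision_accuracy ** decisions
print(f"Chance every decision is correct: {chain_reliability:.1%}")
# -> Chance every decision is correct: 59.9%
```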
More attention should be paid to the overall performance of what Wachter calls “the human-AI dyad.”
- For example, AI tools could be designed to more clearly signal to humans in the loop where their clinical reasoning is needed.
- In such a scenario, AI findings made with 100% confidence could be colored green, while those made with less confidence could be colored yellow or orange; a simple sketch of such a banding rule follows this list.
- Such a setup would better enable regulators and evaluators of such tools to look at “that dyad and its actual outcomes, rather than just assuming the human-in-the-loop adds safety,” Wachter says.
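One way to make the color-coding idea concrete is a confidence-banding rule in the tool’s interface. A minimal sketch; the thresholds, labels and function name here are assumptions for illustration, not from the article or any regulatory guidance:

```python
def confidence_color(confidence: float) -> str:
    """Map a model confidence score to a display color, flagging
    where human clinical reasoning is most needed.
    Thresholds are illustrative assumptions."""
    if confidence >= 0.99:
        return "green"   # near-certain finding
    elif confidence >= 0.90:
        return "yellow"  # review recommended
    else:
        return "orange"  # human judgment required

for score in (1.00, 0.94, 0.80):
    print(f"{score:.2f} -> {confidence_color(score)}")
# 1.00 -> green
# 0.94 -> yellow
# 0.80 -> orange
```

A design like this would also give evaluators a logged confidence trail for each finding, the kind of dyad-level outcome data Wachter describes.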
When it comes to AI in health care, “we have no data or oversight on the orchestra of it all,” says Hast.
