
Potential Harms Caused by AI Hallucinations and How to Avoid Them

In the medical field, “accuracy” has never been optional—it is a lifeline. When doctors rely on diagnostic recommendations generated by AI, or nurses refer to medical record information summarized by AI, even a slight deviation can affect patients’ health or even their lives. However, current artificial intelligence, especially Large Language Models (LLMs), carries a hidden but serious risk: “hallucination”—when a model cannot find an accurate answer, it fabricates content that sounds reasonable but is simply wrong. Such “confident errors” may be a minor inconvenience in everyday settings, but in medicine they turn an “80% accuracy rate” into a “20% rate of potentially dangerous error”.

“Even if these systems are correct 80% of the time, it means they are wrong 20% of the time,” said Dr. Jay Anders, Chief Medical Officer of Medicomp Systems, describing the risks of AI errors and outlining protection strategies for providers.

Health systems are adopting AI tools to help clinicians streamline charting and care plan creation, saving them precious time every day.
But what impact would it have on patient safety if AI makes a wrong judgment?
Even everyday users of ChatGPT and other generative AI tools built on large language models run into these errors, often referred to as “hallucinations”.
AI hallucinations occur when an LLM does not know the correct answer or cannot find appropriate information: instead of acknowledging uncertainty, it fabricates an answer that sounds plausible.

These fabricated responses are particularly problematic because they are usually very convincing. Depending on the content of the question, these hallucinations can be difficult to distinguish from factual information. For example, if an LLM cannot find the correct medical code for a specific condition or procedure, it may make up a number.

The core issue is that LLMs are designed to predict the next word and provide answers, not to respond when information is insufficient. This creates a fundamental contradiction between the technology’s drive to assist and its tendency to generate seemingly plausible but actually inaccurate content when faced with uncertainty.
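
A minimal sketch of the kind of guardrail this implies: before an AI-suggested medical code reaches the record, check it against a verified local code set and escalate anything that cannot be confirmed. The code list, function name, and example codes below are hypothetical illustrations, not any vendor’s implementation.

```python
# Hypothetical guardrail: never document an AI-suggested medical code
# that cannot be verified against an authoritative local code set.
# The code list, function name, and example codes are illustrative only.
from typing import Optional

KNOWN_ICD10_CODES = {
    "E11.9": "Type 2 diabetes mellitus without complications",
    "J06.9": "Acute upper respiratory infection, unspecified",
}

def accept_ai_code(suggested_code: str) -> Optional[str]:
    """Return the code only if it exists in the verified code set."""
    if suggested_code in KNOWN_ICD10_CODES:
        return suggested_code
    # Rather than documenting a fabricated code, escalate to a human coder.
    print(f"Code {suggested_code!r} is not in the verified set; flagging for review.")
    return None

accept_ai_code("E11.9")    # verified, accepted
accept_ai_code("ABC.123")  # not in the local set, rejected and escalated
```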

To learn more about AI hallucinations and their potential impact on healthcare, we recently interviewed Dr. Jay Anders, Chief Medical Officer of Medicomp Systems. Medicomp Systems is a provider of evidence-based clinical AI systems, dedicated to using data for connected care and enhanced decision-making. He plays a key role in product development and serves as a liaison to the healthcare community.

Q: What does AI’s ability to hallucinate mean for healthcare clinical and administrative staff who want to use AI?
A: The impact is quite different between clinical and administrative applications. In clinical medicine, hallucinations can cause serious problems because accuracy is non-negotiable. I recently read a study showing that AI summaries have an accuracy rate of about 80%. That might get you a B- in college, but in healthcare, a B- simply doesn’t work. No one wants B- healthcare—they want A-level care.

Let me give some specific examples of clinical record summarization, a technology that many healthcare IT companies are rushing to apply. When AI summarizes clinical records, it can make two serious mistakes. First, it may fabricate information that doesn’t exist at all. Second, it may misattribute illnesses—attributing a family member’s condition to the patient. So, if I mention “my mother has diabetes”, AI may record that I have diabetes.

AI also has problems with context recognition. If I’m discussing a physical exam, it may introduce elements that have nothing to do with it. It can’t understand what we’re actually talking about.

For administrative tasks, the risks are generally lower. If AI makes a mistake in equipment inventory, drug supply, or scheduling, while problematic, these errors do not directly harm patients. There is a fundamental difference in risk when dealing with clinical documentation versus operational logistics.

Q: What negative consequences can hallucinations in medical AI have? How do they spread through processes and systems?
A: Negative outcomes cascade across multiple levels and are extremely difficult to reverse. When AI writes an incorrect disease, lab result, or medication into a patient’s medical record, the error is almost impossible to correct and can have devastating long-term consequences.

Imagine this scenario: if AI incorrectly diagnoses me with leukemia based on my mother’s medical history, how can I get life insurance? Would employers be willing to hire someone they think has active leukemia? These errors have direct, long-term impacts that extend far beyond the medical field.

The spread problem is particularly insidious. Once incorrect information enters the medical record, it is copied and shared across multiple systems and providers.

Even if I, as a doctor, spot the error and document a correction, the original record has already been sent to many other healthcare organizations, and they won’t receive my correction. It’s like a dangerous game of telephone—errors spread throughout the healthcare network, and each iteration makes tracking and correction more difficult.

This leads to two types of spread: the spread of actual errors and the erosion of system trust. I’ve seen AI-generated summaries that can’t even maintain consistency in a patient’s gender within a single document—referring to someone as “he”, then “she”, then “he” again.

When lawyers encounter such inconsistencies in legal proceedings, they will question everything: “If it can’t determine whether someone is male or female, how can we trust any information?”

The issue of trust is crucial because once confidence in AI-generated content is eroded, even accurate information may be seen as unreliable.

Q: What measures can hospitals and health systems take to avoid the negative consequences of hallucinations when using AI tools?
A: Healthcare organizations need to implement AI strategically, rather than throwing the technology at every problem like mud at a wall to see what sticks. The key is targeted, purposeful deployment combined with strong human oversight.

First, clearly define what problem you are trying to solve with AI. Are you solving a clinical diagnosis problem, or managing drug inventory? Don’t jump into high-risk clinical applications without understanding what the technology can and cannot do.

I know of a vendor that deployed an AI sepsis detection system with an error rate as high as 50%. The hospital’s CEO, who is a friend of mine, simply shut down the system because they realized they didn’t have a serious sepsis problem in the first place.

Second, choose your AI tools carefully. Different models excel at different tasks. What GPT-4 is good at, Claude may not be, and vice versa. Validate the technology with your own data and patient population. Vendors should provide the confidence level of their systems, whether their accuracy rate for your specific use case is 90%, 95%, or only 20%.
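
A minimal sketch of what such local validation might look like, assuming you have a small sample of cases already labeled by your own clinicians; the record structure, field names, and codes below are hypothetical.

```python
# Hypothetical local validation: measure a vendor model's accuracy on your
# own clinician-reviewed cases instead of relying on quoted benchmarks.
# The record structure, field names, and codes are illustrative only.
reviewed_cases = [
    {"case_id": 1, "model_suggestion": "E11.9", "clinician_label": "E11.9"},
    {"case_id": 2, "model_suggestion": "J06.9", "clinician_label": "J02.9"},
    {"case_id": 3, "model_suggestion": "I10",   "clinician_label": "I10"},
]

correct = sum(
    1 for case in reviewed_cases
    if case["model_suggestion"] == case["clinician_label"]
)
accuracy = correct / len(reviewed_cases)

print(f"Local accuracy: {accuracy:.0%} on {len(reviewed_cases)} reviewed cases")
# Compare this figure, not the vendor's marketing claim, against the
# accuracy your specific use case actually requires.
```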

Most importantly, always maintain human oversight. AI should augment human processes, not replace them. Be sure to involve humans to verify the validity of AI outputs before implementing or documenting them. This applies whether you’re dealing with billing, coding, or clinical decision-making. When humans identify AI errors, this feedback can help the system continuously improve.
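
As a rough illustration of that principle, the sketch below assumes a simple review gate in which an AI draft cannot be committed to the chart until a clinician signs off; the class and field names are hypothetical, not a real EHR interface.

```python
# Hypothetical human-in-the-loop gate: nothing generated by the model is
# committed to the chart until a named clinician has signed off on it.
# Class and field names here are illustrative, not a real EHR interface.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AiDraft:
    patient_id: str
    text: str
    reviewed_by: Optional[str] = None  # set only after human verification

def commit_to_record(draft: AiDraft, record: list) -> None:
    """Append an AI draft to the record only after human sign-off."""
    if draft.reviewed_by is None:
        raise ValueError("AI draft has not been verified by a clinician.")
    record.append(f"[reviewed by {draft.reviewed_by}] {draft.text}")

chart = []
draft = AiDraft("PT-001", "Assessment: likely viral upper respiratory infection.")
draft.reviewed_by = "Dr. Smith"  # the human verification step
commit_to_record(draft, chart)
```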

The current environment is like Dodge City: everyone is using AI for everything without proper validation or safeguards. This “AI for AI’s sake” mentality is dangerous. Not all processes need AI.

If a patient comes to my clinic with symptoms such as a low fever, sore throat, and runny nose, I don’t need AI to determine that it’s probably a viral infection. Some situations are inherently simple, and adding AI complexity will only increase costs and the possibility of errors.

Q: What should CIOs, CAIOs, and other healthcare IT leaders ask vendors whose tools incorporate AI, in order to guard against hallucinations?
A: IT leaders need to ask direct, specific questions about validation and performance. Start with the basics: What level of confidence can your system achieve in terms of accuracy? Can you demonstrate your AI’s performance with real healthcare data similar to ours? Don’t fall for vague promises—demand evidence of specific performance metrics.

Ask about training data and validation processes. How was the AI model trained? What types of medical information were used? Has the system been specifically tested for the clinical scenarios you plan to implement? Different AI models have different strengths, so make sure the vendor’s system matches your intended use case.

Ask about human oversight mechanisms. How does the vendor recommend integrating human validation into the workflow? What safety measures are built into the system to flag potentially problematic outputs? Vendors should provide clear recommendations for maintaining human oversight, rather than encouraging full automation.
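
One simple shape such a safeguard could take, assuming the model exposes a per-output confidence score, is sketched below; the threshold, field names, and consistency check are illustrative assumptions rather than any specific vendor’s mechanism.

```python
# Hypothetical output safeguards: flag summaries that come back with low
# model confidence or that are internally inconsistent (for example, mixed
# pronouns, as in the gender example above). The confidence value and the
# threshold are assumptions, not any particular vendor's API.
import re

REVIEW_THRESHOLD = 0.90  # illustrative cutoff, tuned per deployment

def needs_review(summary: str, confidence: float) -> bool:
    """Flag summaries that are low-confidence or internally inconsistent."""
    if confidence < REVIEW_THRESHOLD:
        return True
    pronouns = set(re.findall(r"\b(he|she)\b", summary.lower()))
    return len(pronouns) > 1  # mixed pronouns suggest a garbled summary

print(needs_review("He reports a sore throat. She denies fever.", 0.95))  # True
print(needs_review("He reports a sore throat and a runny nose.", 0.95))   # False
```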

Request information about error detection and correction processes. When hallucinations occur (and they inevitably will), how quickly can they be identified and corrected? What mechanisms are in place to prevent errors from spreading between systems? How does the vendor use feedback to continuously improve their model?

Finally, be wary of vendors that promise revolutionary features that seem too good to be true. Some companies are developing “doctor-replacer” chatbots or complex multi-LLM systems that claim to outperform clinicians. Even if these systems are correct 80% of the time, it means they are wrong 20% of the time. Would you be willing to be part of that 20%?

Our goal is not to avoid AI entirely. The technology can bring benefits when used properly. But we need to implement it cautiously, with appropriate safeguards, and always under human supervision. The risks in healthcare are simply too high, and any approach to deploying AI must be prudent and validated.

The value of AI in medicine has never been to replace human judgment, but to serve as a reliable “assistant”—which means the hidden risk of “hallucination” must first be brought under control. Avoiding harm does not depend on pursuing a “zero-error” perfect model, but on establishing bounded applications and traceable verification: being clear about where AI should be used (such as administrative assistance) and where it should not (such as high-risk diagnosis); validating performance against real medical data rather than accepting vague “accuracy commitments”; and keeping human supervision in place throughout, so that “human checks” remain an indispensable step.

After all, the essence of medical care is putting people first. AI can save time and optimize processes, but it can never displace the ultimate demand for accuracy—because where patients’ lives and health are concerned, there is no tolerance for “hallucination”. Only by keeping the technology operating within a safety framework can AI truly become a help to medical care rather than a hidden danger.
