
How to spot AI 'hallucinations'


Studying was probably easier in the past, when students could simply trust their teachers and textbooks. Nowadays, much of the information students are presented with may not be correct. Websites may contain errors, guidelines may be outdated, literature may come from predatory journals or may simply have slipped through the review process and, of course, online media often contain disinformation as well. AI is not immune to this. Information generated by AI can contain ‘hallucinations’: reasoning that sounds plausible but is nonetheless incorrect. These hallucinations can occur when interacting with HPE-bot as well, even though modern LLM models are less prone to them than older ones.

So how do you spot possibly incorrect information, whether it comes from a traditional medium or is AI-generated? There is no 100% failsafe recipe, but from years of serving on many item review panels I have learned some insights that I want to share with you. By the way, although I am discussing AI-powered tools here, this advice also applies to any other source of information, whether spoken, written or in video form.

1. Overly absolute statements

Make it a habit to mistrust information that claims that something is always or never true. In any health profession, things are contextual: what is true or works in one situation may not be true or may not work in another. Reality in every healthcare context is always more nuanced. Cross-check such information, ideally against other sources such as textbooks or other scientific publications. Bear in mind that Consensus.ai, Perplexity.ai or SciSpace CoPilot can be very helpful in finding these other sources.

2. Too good to be true

Information that sounds too good to be true often is not true, or only partially true. Be wary of suggestions of simple solutions to complex problems. A simple treatment for a complex illness, a single clinical finding or symptom claimed to be pathognomonic, or a simple clinical decision algorithm often undersells the complexity of reality. Check various sources and determine whether they agree and whether reality really is that simple.

3. Overly complex language

If the reply you receive from the AI bot uses fancy or complex language, technical jargon or uncommon terms in a way that feels excessive or out of place, that may mask a lack of substance. So, if the explanation sounds impressive but doesn’t clarify the topic for you, it is worth checking. In that case, it is a good idea to consult independent sources. These can of course be textbooks, but you can also enter a query into AI systems that draw on the existing literature, like Consensus, Perplexity, SciSpace, etc.

4. Monocultural information

Information that comes from only one cultural background may be correct for that context but may not give the whole picture. Culture matters, especially when it comes to communication, patient education and establishing adherence to therapy. Always explore different perspectives. You can use HPE-bot for that by simply asking it to provide solutions or insights from a different perspective (e.g., “What would be the best solution for this problem if you were to approach it from a Japanese cultural background?”).

5. Use common sense

Asking yourself the simple question “Can this be true?” and then asking the bot additional questions will help you to uncover incorrect information. Never switch off your ‘common sense radar’.

6. Illogical or incoherent structure of the reply

If the explanation or argument the LLM provides doesn’t follow a clear, coherent and logical structure, this may be a sign that it is hallucinating. Replies that jump between unrelated points, or that contain information not directly relevant to your question, are a red flag. Human common sense must come into play here: challenge the bot with further questioning, dig deeper and probe whether things can really be true. If you then receive contradictory answers, check other sources.

7. Inconsistency across similar questions

If you ask similar questions about the same topic and the LLM gives different answers to these closely related questions, especially within the same session, this is a serious red flag. It can mean either that one of the pieces of information is plain wrong or that there are different perspectives on the topic. Ask the LLM to explain the differences between the answers to help you determine whether it really is a matter of different perspectives.

8. Generic or template-like responses

If the response is overly generic, does not really pertain to your question or comment, and is written in a way that could apply to almost any situation, it is likely to be a plausible-sounding but incorrect answer.

9. Overproduction of details

A response that includes an excessive amount of minor or irrelevant detail can seem convincing while hiding inaccuracies. The same goes for a response that feels unnecessarily long without directly addressing the question.

10. Lack of cross-referencing

When asked to provide sources or explain its reasoning, a hallucinating AI typically avoids the question, cites unverifiable references, or doubles down on unsupported claims. The references you get from an AI bot may be correct, but more often than not they aren’t. Never rely on them without checking that they really exist and that they actually support the answer.

11. Misuse of technical terminology

Be alert to replies that contain misapplied or fabricated technical terms, often used in ways that sound plausible but are incorrect.

 

How do you test for hallucinations?

  • Ask follow-up questions: Test the AI’s consistency by asking about the same topic in different ways.

  • Explore the topic from another angle: Use the clinical decision-making function (the question mark icon) and the critical reasoning function (the smiley icon). Let the bot produce questions, answer them to the best of your ability using the knowledge from the responses you have received, and study the feedback. Because the AI generates its content on the fly, approaching the topic from another context, such as the decision-making or clinical reasoning questions, will help you to spot whether the information you are getting is consistent.

  • Request an explanation: If the AI cannot provide a clear, logical rationale, it may be hallucinating.

  • Check for external resources: If the AI can't suggest where to verify its claims, that's a potential red flag.

 

 
 
 
