Search engines like Google and Bing, and AI chatbots like ChatGPT and GPT-4, are now major sources of health information. But how reliable are they? A new study published in npj Digital Medicine tested four major search engines and seven large language models, including leading systems like ChatGPT and GPT-4, by asking them 150 medical questions. The study measured how accurate the answers were, how much the results changed depending on how each question was phrased, and whether giving the AI access to search results helped.

Which Gave Better Answers: AI or Search Engines?

AI chatbots, at about 80% accuracy, generally outperformed search engines, at 50-70%, on direct health questions. But the study also found that the chatbots' mistakes are worrying.

Confidence in Errors

The biggest and most dangerous problem was that the AI sometimes gave confidently wrong answers that directly contradicted established medical facts. In a health setting, this is a serious risk.

Overall Accuracy

The AI chatbots generally did better than search engines, correctly answering about 80% of the questions. The best performers were typically GPT-4, ChatGPT, Llama3, and MedLlama3.

Precision Problem

Search engines like Google usually return answers that are correct when they directly address the question, but they often clutter the results with information that is incomplete or off-topic. They also struggled to give a straight "yes" or "no" answer.

User Habits

The study simulated a "lazy" user, who just trusts the first answer, and a "diligent" user, who checks three sources. Surprisingly, the lazy users were sometimes just as accurate as the diligent ones, suggesting that top-ranked results are often good; the risk is when a highly ranked answer happens to be wrong. Bing was the best among the search engines, though not significantly better than Google, Yahoo!, or DuckDuckGo.
How You Ask the Question Matters

The AI's accuracy was highly sensitive to how the question was phrased. Using an "expert" prompt, such as asking the AI to cite reputable medical sources, generally led to better, more medically sound answers, even if they were sometimes less direct.

Giving the AI the top search results before it answered (retrieval augmentation) usually improved performance, especially for smaller models. However, this didn't always help: if the search results given to the AI were irrelevant or low-quality, its answer could actually get worse. More information isn't always better.

Things to Keep in Mind

The researchers also noted a few points. Questions about COVID-19 were easier for both AI and search engines, likely because of the huge amount of data available about the pandemic. And while AI is powerful, its tendency to be swayed by how you word the question, combined with its confident errors, means we need to be very cautious about using it for medical advice.

The overall conclusion is that AI models are promising for health information, but they are not yet fully reliable on their own. Combining them with high-quality search results is likely the best way forward, but only if the search results fed to the AI are trustworthy.
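To make the retrieval-augmentation idea concrete, the sketch below shows roughly how top search snippets can be prepended to a medical question before it is sent to a model. This is a minimal illustration under assumptions, not the study's actual pipeline: the `build_rag_prompt` helper, the prompt wording, and the example snippets are all hypothetical.

```python
def build_rag_prompt(question, snippets):
    """Prepend retrieved search snippets to a yes/no medical question.

    `snippets` is a list of short text excerpts from top search results;
    the surrounding instructions mimic the "expert" prompt style the
    study found helpful (e.g. asking for reputable sources).
    """
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "You are a careful medical assistant. Cite reputable sources.\n"
        "Context from top search results:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer yes or no, then explain briefly."
    )

# Hypothetical snippets; in practice these would come from a search engine.
snippets = [
    "CDC: Vaccines do not cause autism.",
    "WHO: Extensive studies show no link between vaccines and autism.",
]
prompt = build_rag_prompt("Can vaccines cause autism?", snippets)
print(prompt)
```

Note that this design is only as good as its inputs: if the retrieved snippets are off-topic or wrong, the model's answer can get worse, exactly the failure mode the study observed.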