AI hallucinations

wtg
#1

    Oh, goody!

    An enduring problem with today’s generative artificial intelligence (AI) tools, like ChatGPT, is that they often confidently assert false information. Computer scientists call this behavior “hallucination,” and it’s a key barrier to AI’s usefulness.

Hallucinations have led to some embarrassing public slip-ups. In February, Air Canada was forced by a tribunal to honor a discount that its customer-support chatbot had mistakenly offered to a passenger. In May, Google was forced to make changes to its new “AI overviews” search feature after the bot told some users that it was safe to eat rocks. And last June, two lawyers were fined $5,000 by a U.S. judge after one of them admitted he had used ChatGPT to help write a court filing. He came clean because the chatbot had added fake citations to the submission, which pointed to cases that never existed.

But in good news for lazy lawyers, lumbering search giants, and errant airlines, at least some types of AI hallucinations could soon be a thing of the past. New research, published Wednesday in the peer-reviewed scientific journal Nature, describes a new method for detecting when an AI tool is likely to be hallucinating. The method described in the paper can distinguish between correct and incorrect AI-generated answers approximately 79% of the time, about 10 percentage points higher than other leading methods. Although the method only addresses one of the several causes of AI hallucinations, and requires roughly 10 times more computing power than a standard chatbot conversation, the results could pave the way for more reliable AI systems in the near future.

    https://time.com/6989928/ai-artificial-intelligence-hallucinations-prevent/?utm_placement=newsletter
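
Rough idea of how this kind of detector works, as I understand it (this looks like the "semantic entropy" line of work: ask the same question several times, cluster the answers by meaning, and flag high disagreement). A minimal sketch; the ask_model and same_meaning helpers are hypothetical stand-ins, not anything from the paper:

```python
import math

def ask_model(question: str) -> str:
    """Hypothetical stand-in for a chatbot API call with sampling enabled."""
    raise NotImplementedError

def same_meaning(a: str, b: str) -> bool:
    """Hypothetical semantic-equivalence check. In practice this judgment
    is itself made by a language model, not by string comparison."""
    return a.strip().lower() == b.strip().lower()

def semantic_entropy(question: str, n_samples: int = 10) -> float:
    """Sample several answers, cluster them by meaning, and return the
    entropy over the clusters. High entropy means the model keeps changing
    its story, which is a hallucination signal. Drawing n_samples answers
    is also why this costs roughly 10x a normal chat exchange."""
    answers = [ask_model(question) for _ in range(n_samples)]
    clusters: list[list[str]] = []
    for ans in answers:
        for cluster in clusters:
            if same_meaning(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    probs = [len(c) / n_samples for c in clusters]
    return -sum(p * math.log(p) for p in probs)
```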

ShiroKuro
#2

Last fall, I had my students do an AI activity in class, and this was one of the things I really drilled into them: false information from AI is really hard to detect because it seems legit. Often it’s plausible, which is part of it, but more than that, it’s because the language is so well-constructed, i.e., there are no grammatical mistakes, the word choice is not only good but natural, the response fits the query quite well, etc. Oh, and one other thing: AI output in response to a query is almost always pretty long. Rather than one or two sentences, you often get several paragraphs, and the sheer amount of text can sometimes be overwhelming.

And because of all that, the user has a hard time approaching it with an appropriate degree of suspicion, which ends up making the user gullible and almost too easily convinced. And if your goal is to not be fooled by AI, there’s sort of an embedded catch-22 there: the most reliable way to detect an AI hallucination is when the topic and content are things the user already knows, but if you already know, you’re not going to be asking AI.

BTW here’s the activity we did: in the latter half of the semester, I had students work in groups to put together a set of essay-type questions based on material covered in the first half of the semester. The idea was that this was all information and ideas that the students knew very well, because we’d spent the previous 8-10 weeks talking about it. After crafting all the questions and creating rubrics with the kinds of info/content they’d want to see in the answers, each group put two of the questions to the AI and then evaluated and scored the answers.

Students were very critical of the quality of the answers. There weren’t a lot of flat-out wrong answers, but the AI scored low across the board for things like lack of depth, missing the main point of the questions, and so on. The students pretty much felt that the AI answers seemed like something coming from someone who didn’t really know anything about the subject and was just trying to fake it by throwing out a lot of commonly known tidbits.

I won’t teach that class again until Spring 2025, so it will be interesting to see 1) whether this kind of exercise is still relevant, and 2) how well the AI performs if we do it.

Axtremus
#3

        No idea if ChatGPT or some other generative AI is involved.

        Link to video

wtg
#4

A student in America asked an artificial intelligence program to help with her homework. In response, the app told her "Please die." The eerie incident happened when 29-year-old Sumedha Reddy of Michigan sought help from Google’s Gemini chatbot, a large language model (LLM), the New York Post reported.

          The program verbally abused her, calling her a “stain on the universe.” Reddy told CBS News that she got scared and started panicking. “I wanted to throw all of my devices out the window. I hadn’t felt panic like that in a long time to be honest,” she said.

The assignment Reddy was working on involved identifying and solving challenges that adults face as they age. The program blurted out words that hit the student hard and were akin to bullying.

          https://www.wionews.com/world/please-die-you-are-a-stain-on-the-universe-ai-chatbot-tells-girl-seeking-help-for-homework-776635

wtg
#5

            Hallucinated citations are polluting the scientific literature. What can be done?
            Tens of thousands of publications from 2025 might include invalid references generated by AI, a Nature analysis suggests.

            https://www.nature.com/articles/d41586-026-00969-z
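
One partial answer to the "what can be done" question: reference lists are machine-checkable. A minimal screen (my sketch, not anything from the Nature analysis) is to look up each cited DOI in the public Crossref registry; both example DOIs below are made up:

```python
import requests  # third-party: pip install requests

def doi_resolves(doi: str) -> bool:
    """Ask the public Crossref API whether a DOI exists. A 404 means
    Crossref has no record of it -- a flag for manual review, not proof
    of fabrication, since legitimate non-Crossref DOIs also exist."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# Hypothetical reference list pulled from a submission:
for doi in ["10.1234/real-looking.2024.001", "10.9999/fake.2025.123"]:
    print(doi, "->", "found" if doi_resolves(doi) else "NOT FOUND")
```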

ShiroKuro
#6

Oh wow, this thread was started two years ago!! And the hallucination problem doesn’t seem much changed. I don’t know that it’s necessarily worse, but the scale of AI use means that even a 10% or 20% hallucination rate is a huge number in absolute terms.

jon-nyc
#7

                It's gotten much better. As an academic you should be aware that those tens of thousands of publications in 2025 were written ~2023. (though maybe linguistics is different than science)

ShiroKuro
#8

                  @jon-nyc said:

                  As an academic you should be aware that those tens of thousands of publications in 2025 were written ~2023.

Good point, I wasn't really thinking about that. (Ling is mostly the same, no faster turnaround time than other social sciences.)

                  But with this:

                  the hallucination problem doesn’t seem much changed.

                  I wasn't thinking about the academic article issue, more so just the hallucination problem.

                  I think AI has gotten better, in many ways. And I say that not just based on claims made by the companies producing the products, but from my own usage (for both research and teaching purposes).

So AI itself is improving. But IMO the hallucination problem hasn't really gone away: even if the percentages are improving, the problem persists. And what I wrote above (two years ago!) continues to be a problem. Namely, that the tasks people use AI for predispose them to be vulnerable to hallucinations, because they're using AI for something they don't know, or can't do (or can't do fast enough) on their own.

Oh, but here's something that has changed (improved) in the last two years: tools like Notebook LM, which use RAG (retrieval-augmented generation). This "retrieval model" has significantly fewer hallucinations (though not zero) because of how it works; rough sketch below.
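
The core pattern, as I understand it (a minimal sketch assuming the standard RAG recipe; the retrieve and generate functions are hypothetical stand-ins, not Notebook LM's actual internals):

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM completion call."""
    raise NotImplementedError

def retrieve(query: str, documents: list[str], k: int = 3) -> list[str]:
    """Hypothetical retriever: return the k passages most relevant to the
    query. Real systems rank by embedding similarity; crude word overlap
    is used here only to keep the sketch self-contained."""
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:k]

def answer_with_rag(query: str, documents: list[str]) -> str:
    """Pin the model to retrieved passages instead of its own memory.
    Telling it to say it doesn't know when the sources are silent is the
    key anti-hallucination move -- exactly what plain ChatGPT usage skips."""
    context = "\n\n".join(retrieve(query, documents))
    prompt = (
        "Answer using ONLY the sources below. If they don't contain "
        "the answer, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```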

But ChatGPT continues to be the most widely used product, and it's the worst in terms of hallucinations because of how it functions (e.g., not saying when it doesn't know, etc.).

                  So @jon-nyc do you use AI for any professional purposes? Which product(s), and for what kinds of tasks?

ShiroKuro
#9

                    BTW from yesterday's NYT:

                    How Accurate Are Google’s A.I. Overviews?
                    The company’s A.I.-generated answers look authoritative, but they draw on an array of sources, from trustworthy sites to Facebook posts.

                    Regular link: https://www.nytimes.com/2026/04/07/technology/google-ai-overviews-accuracy.html

                    Gift link: https://www.nytimes.com/2026/04/07/technology/google-ai-overviews-accuracy.html?unlocked_article_code=1.ZVA.gVON.lCi51gwbSqxm&smid=url-share
