• stickyprimer@lemmy.world
    English
    arrow-up
    27
    ·
    1 day ago

    91% accuracy is the kind of thing that may sound good… hey! It’s an A minus! But it’s actually completely, totally unacceptable. Imagine if the turn signal wand on your car operated with 91% accuracy. About one in every ten times it would light up the wrong direction. How many accidents are we causing? A lot.
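To put a rough number on that intuition, here is a minimal sketch of how a 9% error rate compounds over repeated use. It assumes every use fails independently with the same probability, which the comment above does not claim; the function name is hypothetical and only for illustration.

```python
def chance_of_at_least_one_error(accuracy: float, uses: int) -> float:
    """Probability that at least one of `uses` independent attempts is wrong.

    Assumption: each attempt is independent and has the same accuracy.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if uses < 0:
        raise ValueError("uses must be non-negative")
    # All attempts are correct with probability accuracy ** uses,
    # so at least one error occurs with the complementary probability.
    return 1.0 - accuracy ** uses


if __name__ == "__main__":
    for uses in (1, 10, 50):
        p = chance_of_at_least_one_error(0.91, uses)
        print(f"{uses:>3} uses -> {p:.1%} chance of at least one error")
```

Under that independence assumption, the chance of hitting at least one wrong result passes 50% well before ten uses.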

    • mabeledo@lemmy.world
      English
      arrow-up
      3
      ·
      14 hours ago

      Even the number is a bit misleading. First of all, anyone who has ever done LLM benchmarking knows that this isn’t an exact science, at all. You can totally get a 99% on a benchmark and fail every single task on another.

      But even this particular claim is nuanced. From the original article:

      “But with Gemini 3, Google’s A.I.-generated answers were more likely to be ungrounded than when the system was based on Gemini 2, meaning the websites they linked to did not completely support the information they provided. In October, correct answers were ungrounded 37 percent of the time. In February, with Gemini 3, that figure rose to 56 percent.”

      See https://www.nytimes.com/2026/04/07/technology/google-ai-overviews-accuracy.html

      Meaning that even when the answer is correct, 56% of the time users cannot fully verify it against the sources the LLM claims it’s using.

    • Impractical_Island@lemmy.world
      English
      arrow-up
      1
      arrow-down
      4
      ·
      1 day ago

      This is why we should ban cars outright. Go back to writing on paper. I can stick a pen in my ass and make a cute drawing of a cat. In fact, I might be able to eat a cat and defecate it later, to make it more realistic. And that’s what we need to be; realistic.

      (This comment is about AI data centers)

      • Impractical_Island@lemmy.world
        English
        arrow-up
        1
        arrow-down
        3
        ·
        1 day ago

        I make this “comment” every once in a while because I once called someone out on how little sense their post made by parodying it, and now I just keep doing it.

          • Impractical_Island@lemmy.world
            English
            arrow-up
            1
            ·
            14 hours ago

            I’m drawing attention to my educational (f)art project while simultaneously goading someone who thought a less hyperbolic but still nonsensical analogy was the greatest tweet anyone has ever made. I remember the first time something I did got seen by millions, so I can understand their enthusiasm to defend it. At the same time, we’re still talking about AI data centers, right? I am, at least.

    • Lovable Sidekick@lemmy.world
      English
      arrow-up
      2
      arrow-down
      2
      ·
      1 day ago

      Whether 91% accuracy is acceptable depends on how unacceptable the 9% inaccuracy is. If 91% of the information in your term paper is correct you’ll probably get a decent grade, but if you only kill 91% of cancer cells the surviving 9% will grow a treatment-resistant tumor and you’ll probably die. This makes percentages essentially useless on their own; what matters more is how badly wrong the worst wrong result is.
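One way to picture that point is to weight the error rate by how costly each error is. The sketch below is only an illustration: the cost figures are made-up placeholders on an arbitrary scale, not data from the thread or the article, and `expected_cost` is a hypothetical helper name.

```python
def expected_cost(accuracy: float, cost_per_error: float) -> float:
    """Average cost per use: the error rate times the cost of one error."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return (1.0 - accuracy) * cost_per_error


if __name__ == "__main__":
    # Hypothetical severities on an arbitrary scale (assumptions, not measurements).
    scenarios = {
        "term paper fact": 1.0,
        "medical treatment": 1000.0,
    }
    for name, cost in scenarios.items():
        print(f"{name}: expected cost per use = {expected_cost(0.91, cost):.2f}")
```

The same 91% accuracy yields very different expected costs once severity is factored in, and even this average understates the comment's point about the single worst outcome.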