• FiskFisk33@startrek.website
    7 upvotes · 2 days ago · edited

    A screwdriver beat a hammer in a screw-driving competition. Well done.

    Maybe it sounds impressive when hammers are super hyped and everyone and their mother is driving their screws with them, but it honestly really isn’t.

  • TheAlbatross@lemmy.blahaj.zone
    52 upvotes · 3 days ago

    Gotta remember that AI isn’t occasionally hallucinating and often recalling information. It’s never recalling information; it’s always hallucinating. We just say we like some of the hallucinations, so it does those more often.

    • The Picard Maneuver@lemmy.worldOP
      6 upvotes, 2 downvotes · 3 days ago

      I wouldn’t be surprised if it’s literally zero. I’ve tried with a few LLMs, and they’re all very confident that they know how to play chess, but they just start hallucinating illegal moves immediately.

      • la_scriba@sopuli.xyz
        2 upvotes · 2 days ago

        Immediately? When was the last time you tried? The newer models can hold a game well for 10-20 moves.

        • The Picard Maneuver@lemmy.worldOP
          1 upvote · 2 days ago

          A few weeks ago, Gemini got confused when it tried to go first as black multiple times, so that’s the most immediate one I can remember. Last week, ChatGPT offered to set up chess puzzles for me, but it made mistakes 3 out of 3 times.

          Maybe I’ll try again. Is there a certain one you’ve seen good performance out of?

  • Fredselfish@lemmy.world
    3 upvotes, 1 downvote · 3 days ago

    To be fair, the Atari was built to play games. /s

    But for real, wasn’t AI supposed to be, bare minimum, good at this game? Is this not how we were to train them in order to know if they are intelligent or not?