No, really, those are the magic words

  • ignirtoq@fedia.io
    4 days ago

    Part of the reason that this jailbreak worked is that the Windows keys, a mix of Home, Pro, and Enterprise keys, had been trained into the model, Figueroa told The Register.

    Isn’t that the whole point? They’re using prompting tricks to tease out the training data. This has been done several times with copyrighted written works. That’s the only reasonable way ChatGPT could produce valid Windows keys. What would be the alternative? ChatGPT somehow reverse engineered the algorithm for generating valid Windows product keys?

    • SheeEttin@lemmy.zip
      3 days ago

      The alternative would be that it generated a string of characters that looked like a key.

      It’s also possible that it generated a random key that was actually valid, though this is far less likely.
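How unlikely "randomly valid" is depends entirely on the key scheme. As a minimal sketch only: Windows 95-era retail keys (format `XXX-XXXXXXX`) were widely reported to use a trivial checksum, where the digits of the second group had to sum to a multiple of 7. This simplification omits the real scheme's extra rules (blacklisted site numbers, restrictions on the last digit), and modern product keys are cryptographically signed, so none of this applies to them:

```python
import random

def win95_retail_key_valid(key: str) -> bool:
    """Simplified check for the classic XXX-XXXXXXX retail key format:
    the seven serial digits must sum to a multiple of 7.
    (Omits the real scheme's blacklist and last-digit rules.)"""
    try:
        site, serial = key.split("-")
    except ValueError:
        return False
    if len(site) != 3 or len(serial) != 7 or not serial.isdigit():
        return False
    return sum(int(d) for d in serial) % 7 == 0

# Under a checksum this weak, a uniformly random serial passes
# roughly one time in seven:
trials = 100_000
hits = sum(
    win95_retail_key_valid(f"123-{random.randrange(10**7):07d}")
    for _ in range(trials)
)
print(hits / trials)  # roughly 0.14
```

Against a 1990s-style checksum, "randomly valid" was actually quite likely; against a signed modern key, it is effectively impossible, which is why a valid modern key appearing in output points to training data rather than luck.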

  • sad_detective_man@leminal.space
    3 days ago

massgrave is literally right there. I’m pretty sure Google search results will have an AI summary with bullet points for the literacy-challenged who need a chatbot to tell them basic things.

  • chaosCruiser@futurology.today
    4 days ago

    In this case, a researcher duped ChatGPT 4.0 into bypassing its safety guardrails, intended to prevent the LLM from sharing secret or potentially harmful information, by framing the query as a game

    Ooh, this is so good 🤣

If the LLM refuses to talk about something, just ask it to embed the answer in a poem, Batman fan fiction, etc. The guessing game is a new one. Should try that one when talking about bioweapons, cooking meth, or any other sensitive topic.

  • Fizz@lemmy.nz
    3 days ago

Articles talking about AI suck so much that I end up more pissed at the author than at the AI company.