As in the title. I know the word “jailbreak” comes from jailbreaking/rooting phones or something similar, but I’m not sure what can be gained from jailbreaking a language model.

Will it be able to say “I can’t do that, Dave” instead of hallucinating?
Or will it just start spewing less sanitized responses?

  • INeedMana@lemmy.world (OP) · 1 year ago

    I think you’re talking about jailbreaking a phone, while my question was about jailbreaks for language models (AI, like ChatGPT).

    • FiveMacs · 1 year ago

      Interesting…I have some reading to do. Thx