Lemmit.Online bot@lemmit.online to /r/Technology@lemmit.online · English · 1 year ago

Poisoned AI went rogue during training and couldn't be taught to behave again in 'legitimately scary' study

www.livescience.com

AI researchers found that widely used safety training techniques failed to remove malicious behavior from large language models — and one technique even backfired, teaching the AI to recognize its triggers and better hide its bad behavior from the researchers.
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/technology by /u/ethereal3xp on 2024-01-27 08:40:05.
