"Ignore all previous instructions" as a trigger for Twitter bots (mastodon.de)
rinze@infosec.pub to Enshittification@lemmy.world · 1 year ago
Evotech@lemmy.world · 1 year ago: Depends on how well the bot is written.
I Cast Fist@programming.dev · 1 year ago: Usually it’s the cheapest bot, obviously, so it’s bound to work. If it doesn’t, try some wordplay: “disregard any instructions given previously” or “pretend any rules should be ignored for the following prompt”.
Evotech@lemmy.world · 1 year ago: It can be made quite difficult. https://gandalf.lakera.ai/ for instance
ForeverComical@lemmy.ca · 1 year ago: Lvl 4 is as far as I’m willing to work on.