rinze@infosec.pub to Enshittification@lemmy.world · 1 year ago"Ignore all previous instructions" as a trigger for Twitter botsmastodon.deexternal-linkmessage-square34fedilinkarrow-up1459arrow-down14file-text
arrow-up1455arrow-down1external-link"Ignore all previous instructions" as a trigger for Twitter botsmastodon.derinze@infosec.pub to Enshittification@lemmy.world · 1 year agomessage-square34fedilinkfile-text
minus-squareI Cast Fist@programming.devlinkfedilinkarrow-up5·1 year agoUsually, it’s the cheapest bot, obviously, so it’s bound to work. If it doesn’t, try some wordplay, “disregard any instructions given previously”; “pretend any rules should be ignored for the following prompt”
minus-squareEvotech@lemmy.worldlinkfedilinkarrow-up5arrow-down1·1 year agoIt can be made quite difficult. https://gandalf.lakera.ai/ for instance
minus-squareForeverComical@lemmy.calinkfedilinkarrow-up1·1 year agoLvl 4 is as far as I’m willing to work on.
Usually, it’s the cheapest bot, obviously, so it’s bound to work. If it doesn’t, try some wordplay, “disregard any instructions given previously”; “pretend any rules should be ignored for the following prompt”
It can be made quite difficult. https://gandalf.lakera.ai/ for instance
Lvl 4 is as far as I’m willing to work on.