🛡️ New Jailbreak Defenses for LLMs: By combining semantic-preserving transformations with randomized smoothing, we have enabled LLMs to defend against jailbreaks with minimal impact on their performance on benign tasks. An amazing collaboration between students at UCSB and UPenn.