XuandongZhao / weak-to-strong

Weak-to-Strong Jailbreaking on Large Language Models
62 · Updated 6 months ago

Related projects: