Algorithmic-Alignment-Lab / CommonClaim

Explore, Establish, Exploit: Red Teaming Language Models from Scratch
11 · Updated last year

Related projects

Alternatives and complementary repositories for CommonClaim