Independent robustness evaluation of Improving Alignment and Robustness with Short Circuiting
☆17 · Apr 15, 2025 · Updated last year
Alternatives and similar repositories for circuit-breakers-eval
Users that are interested in circuit-breakers-eval are comparing it to the libraries listed below.
- A method for training neural networks that are provably robust to adversarial attacks. [IJCAI 2019] ☆10 · Sep 3, 2019 · Updated 6 years ago
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs ☆13 · Jun 20, 2025 · Updated 9 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆70 · Feb 22, 2024 · Updated 2 years ago
- Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models ☆28 · Mar 15, 2025 · Updated last year
- Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting, ICLR 2025 ☆29 · Dec 4, 2025 · Updated 4 months ago
- ☆30 · Jun 19, 2023 · Updated 2 years ago
- Code and data to go with the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks"