☆34 · Nov 12, 2024 · Updated last year
Alternatives and similar repositories for rapidresponsebench
Users that are interested in rapidresponsebench are comparing it to the libraries listed below
- ☆25 · Sep 3, 2025 · Updated 5 months ago
- Code for the API, workload execution, and agents underlying the LLMail-Inject Adaptive Prompt Injection Challenge ☆19 · Updated this week
- ☆13 · Sep 12, 2024 · Updated last year
- Code Repository for Blog - How to Productionize Large Language Models (LLMs) ☆12 · Mar 27, 2024 · Updated last year
- ☆31 · Sep 23, 2024 · Updated last year
- ☆35 · May 21, 2025 · Updated 9 months ago
- Material for the series of seminars on Large Language Models ☆34 · Apr 21, 2024 · Updated last year
- Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024] ☆21 · Apr 15, 2024 · Updated last year
- ☆18 · Apr 15, 2024 · Updated last year
- Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025] ☆78 · Jan 23, 2025 · Updated last year
- ☆20 · Apr 7, 2024 · Updated last year
- The most comprehensive and accurate LLM jailbreak attack benchmark by far ☆22 · Mar 22, 2025 · Updated 11 months ago
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment ☆27 · Jun 11, 2025 · Updated 8 months ago
- This repository contains the implementation of evaluation metrics for recommendation systems. We have compared similarity, candidate gene … ☆27 · Feb 21, 2025 · Updated last year
- Datastructure for data science ☆23 · Apr 12, 2024 · Updated last year
- CVPR'19 experiments with (on-manifold) adversarial examples. ☆43 · Feb 27, 2020 · Updated 6 years ago
- Improving Alignment and Robustness with Circuit Breakers ☆258 · Sep 24, 2024 · Updated last year
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses, NeurIPS Spotlight 2020 ☆27 · Dec 23, 2020 · Updated 5 years ago
- ☆67 · Mar 30, 2025 · Updated 11 months ago
- 🤓 A collection of AWESOME structured summaries of Large Language Models (LLMs) ☆31 · Sep 7, 2023 · Updated 2 years ago
- ☆34 · Jan 25, 2024 · Updated 2 years ago
- ☆196 · Nov 26, 2023 · Updated 2 years ago
- Runtime protection for AI agents ☆105 · Feb 23, 2026 · Updated last week
- ☆28 · Apr 3, 2025 · Updated 10 months ago
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 · Apr 7, 2025 · Updated 10 months ago
- A fast + lightweight implementation of the GCG algorithm in PyTorch ☆318 · May 13, 2025 · Updated 9 months ago
- ☆30 · Dec 6, 2024 · Updated last year
- Auditing agents for fine-tuning safety ☆20 · Oct 21, 2025 · Updated 4 months ago
- Panda Guard is designed for researching jailbreak attacks, defenses, and evaluation algorithms for large language models (LLMs). ☆62 · Jan 19, 2026 · Updated last month
- Jailbreak artifacts for JailbreakBench ☆80 · Nov 6, 2024 · Updated last year
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆48 · Jan 12, 2026 · Updated last month
- Conversational agents for engineering simulations with minimal human input using Microsoft AutoGen & GPT-4o. ☆41 · Aug 4, 2024 · Updated last year
- Open Source Replication of Anthropic's Alignment Faking Paper ☆54 · Apr 4, 2025 · Updated 10 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Jun 10, 2024 · Updated last year
- Neural theorem proving tutorial, version II ☆40 · Apr 26, 2024 · Updated last year
- First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and saf… ☆50 · Dec 3, 2025 · Updated 2 months ago
- code pattern and instructions to deploy intelligent loan web app ☆11 · Sep 17, 2025 · Updated 5 months ago
- Intelligent Document Processing with AWS AI/ML, published by Packt ☆12 · Feb 5, 2026 · Updated 3 weeks ago
- Provable Robustness of ReLU networks via Maximization of Linear Regions [AISTATS 2019] ☆31 · Jul 15, 2020 · Updated 5 years ago