☆50Aug 3, 2024Updated last year
Alternatives and similar repositories for redteaming-resistance-benchmark
Users that are interested in redteaming-resistance-benchmark are comparing it to the libraries listed below.
- LLM red teaming datasets from the paper 'Student-Teacher Prompting for Red Teaming to Improve Guardrails' for the ART of Safety Workshop …☆24Oct 12, 2023Updated 2 years ago
- Fancy upgrade to console.log☆21Jun 1, 2022Updated 3 years ago
- NVIDIA’s repository for enabling trustworthy AI.☆28Mar 3, 2026Updated 3 weeks ago
- ☆16May 30, 2024Updated last year
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆37Feb 22, 2025Updated last year
- autoredteam: code for training models that automatically red team other language models☆15Aug 9, 2023Updated 2 years ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)☆12Oct 31, 2024Updated last year
- A Text2Speech Engine built in Pytorch.☆12Dec 9, 2018Updated 7 years ago
- ☆36May 23, 2023Updated 2 years ago
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness.☆26Aug 3, 2024Updated last year
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1☆13Apr 23, 2025Updated 11 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.☆99Apr 13, 2025Updated 11 months ago
- ☆14Dec 19, 2024Updated last year
- EmojiNotion – project for CBMI 2025 conference☆14Jun 6, 2025Updated 9 months ago
- ☆10Mar 13, 2023Updated 3 years ago
- Automated Safety Testing of Large Language Models☆18Jan 31, 2025Updated last year
- Official Code Release for "Training a Generally Curious Agent"☆45May 18, 2025Updated 10 months ago
- Medium API Python SDK☆15Dec 8, 2022Updated 3 years ago
- ☆15Jun 7, 2024Updated last year
- ☆27May 20, 2025Updated 10 months ago
- A command line tool for crawling a website for dead links, permanent and/or fatal redirects, resource load issues, and script errors. It…☆12Apr 16, 2023Updated 2 years ago
- Tree of Attacks (TAP) Jailbreaking Implementation☆118Feb 7, 2024Updated 2 years ago
- ☆33Sep 19, 2025Updated 6 months ago
- Run all the tests at the same time with modal.com☆11Mar 2, 2024Updated 2 years ago
- AutoEDA: An Automated Exploratory Data Analysis (EDA) Toolkit Simplify and automate your data exploration process with AutoEDA. This ope…☆20Nov 11, 2023Updated 2 years ago
- Official Implementation of Harnessing Perceptual Adversarial Patches for Crowd Counting (ACM CCS)☆18Apr 28, 2023Updated 2 years ago
- The website of the Public AI Network☆20Mar 12, 2026Updated 2 weeks ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter".☆12Dec 27, 2023Updated 2 years ago
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆27Mar 9, 2026Updated 3 weeks ago
- ATLAS tactics, techniques, and case studies data☆120Feb 6, 2026Updated last month
- ☆56Mar 17, 2026Updated last week
- PyTorch Implementation of the Deep k-Nearest-Neighbors algorithm, https://arxiv.org/abs/1803.04765☆16Aug 18, 2020Updated 5 years ago
- Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024☆18Mar 25, 2025Updated last year
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal☆890Aug 16, 2024Updated last year
- Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs☆33Dec 9, 2025Updated 3 months ago
- PyTorch implementation of Expectation over Transformation☆13Jul 18, 2025Updated 8 months ago
- A lightweight and minimal csv database package with SQL-like syntax☆12Jan 26, 2025Updated last year
- Code to reproduce experiments from the EMNLP 2015 paper about Rumour Stance Classification with Gaussian Processes.☆37May 23, 2016Updated 9 years ago
- 【ACL 2024】 SALAD benchmark & MD-Judge☆172Mar 8, 2025Updated last year