☆50Aug 3, 2024Updated last year
Alternatives and similar repositories for redteaming-resistance-benchmark
Users that are interested in redteaming-resistance-benchmark are comparing it to the libraries listed below.
- Fancy upgrade to console.log☆21May 20, 2026Updated last week
- ☆16May 30, 2024Updated last year
- NVIDIA’s repository for enabling trustworthy AI.☆37May 22, 2026Updated last week
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆38Feb 22, 2025Updated last year
- This repo contains a demo of adversarial strings poisoning a vector database and forcing specific hallucinations on a RAG chatbot.☆10May 2, 2024Updated 2 years ago
- A curated collection of papers and related projects on using LLMs for privacy.☆31Oct 8, 2025Updated 7 months ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)☆12Oct 31, 2024Updated last year
- Open Source Auth Built on Freestyle: own your auth + data https://docs.freestyle.dev/guides/authentication/☆23Jun 12, 2024Updated last year
- [ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions☆14Mar 7, 2026Updated 2 months ago
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1☆13Apr 23, 2025Updated last year
- Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning☆22Jul 8, 2024Updated last year
- ☆10Mar 13, 2023Updated 3 years ago
- Automated Safety Testing of Large Language Models☆18Jan 31, 2025Updated last year
- Implementation of BEAST adversarial attack for language models (ICML 2024)☆88May 14, 2024Updated 2 years ago
- ☆48May 9, 2024Updated 2 years ago
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models"☆11Jun 18, 2024Updated last year
- LLM evaluation.☆16Nov 7, 2023Updated 2 years ago
- ☆15Jun 7, 2024Updated last year
- ☆27May 20, 2025Updated last year
- A command line tool for crawling a website for dead links, permanent and/or fatal redirects, resource load issues, and script errors. It…☆12Apr 16, 2023Updated 3 years ago
- Tree of Attacks (TAP) Jailbreaking Implementation☆120Feb 7, 2024Updated 2 years ago
- ☆34Sep 19, 2025Updated 8 months ago
- Agent Zero (agent-zero.ai) extensions for ethical penetration testing☆21Sep 10, 2025Updated 8 months ago
- Persuasive Jailbreaker: we can persuade LLMs to jailbreak them!☆357Oct 17, 2025Updated 7 months ago
- MCP easy installer is a robust MCP server with tools to search, install, configure, repair, and uninstall MCP servers☆17Apr 19, 2025Updated last year
- A Security Benchmark for Claude Code Agent Skills☆51May 19, 2026Updated last week
- Open Imi is an open source Claude Desktop alternative for developers, engineers and tech teams to hack MCPs and agents to their own likin…☆11Nov 16, 2025Updated 6 months ago
- An interactive CLI application for interacting with authenticated Jupyter instances.☆56May 7, 2025Updated last year
- Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024☆18Mar 25, 2025Updated last year
- [EMNLP 2025 Findings] Familiarity-aware Evidence Compression for Retrieval Augmented Generation☆15Aug 20, 2025Updated 9 months ago
- Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment☆111Mar 8, 2024Updated 2 years ago
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal☆953Aug 16, 2024Updated last year
- Code for paper [Explaining image classifiers by removing input features using generative models] [ACCV 2020] https://arxiv.org/abs/1910.0…☆15Nov 22, 2022Updated 3 years ago
- Code to reproduce experiments from the EMNLP 2015 paper about Rumour Stance Classification with Gaussian Processes.☆37May 23, 2016Updated 10 years ago
- 【ACL 2024】 SALAD benchmark & MD-Judge☆175Mar 8, 2025Updated last year
- ☆14Mar 23, 2023Updated 3 years ago
- Sync MCP (Model Context Protocol) configurations across AI tools☆45Jun 20, 2025Updated 11 months ago
- Embed any reddit post onto your website!☆24Jun 11, 2021Updated 4 years ago
- Dynamic Numpy arrays☆13Feb 26, 2017Updated 9 years ago