A trivial programmatic Llama 3 jailbreak. Sorry Zuck!
☆567Jan 26, 2025Updated last year
Alternatives and similar repositories for llama3-jailbreak
Users who are interested in llama3-jailbreak are comparing it to the repositories listed below
- Red-Teaming Language Models with DSPy☆253Feb 13, 2025Updated last year
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.☆100Apr 13, 2025Updated 11 months ago
- Llama-3 agents that can browse the web by following instructions and talking to you☆1,405Dec 10, 2024Updated last year
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness.☆26Aug 3, 2024Updated last year
- ☆28Oct 22, 2024Updated last year
- ☆16May 30, 2024Updated last year
- (AAAI'25) Training-and-prompt Free General Painterly Image Harmonization Using image-wise attention sharing☆61Dec 17, 2024Updated last year
- ☆867Jan 22, 2025Updated last year
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.☆3,161Feb 11, 2026Updated last month
- ☆15Apr 26, 2025Updated 10 months ago
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be…☆3,062Apr 24, 2025Updated 10 months ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024☆11Jun 6, 2024Updated last year
- A fast + lightweight implementation of the GCG algorithm in PyTorch☆321May 13, 2025Updated 10 months ago
- Universal and Transferable Attacks on Aligned Language Models☆4,568Aug 2, 2024Updated last year
- MAexp is a generic platform for RL-based multi-agent exploration☆107Aug 25, 2025Updated 6 months ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.☆116Jun 13, 2024Updated last year
- Python package for measuring memorization in LLMs.☆184Jul 16, 2025Updated 8 months ago
- [CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak…☆3,605Dec 24, 2024Updated last year
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep☆175Apr 23, 2025Updated 10 months ago
- Independent robustness evaluation of Improving Alignment and Robustness with Short Circuiting☆18Apr 15, 2025Updated 11 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you…☆84Dec 17, 2024Updated last year
- Large Action Model framework to develop AI Web Agents☆6,318Jan 21, 2025Updated last year
- We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20…☆345Feb 23, 2024Updated 2 years ago
- [ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…☆434Jan 22, 2025Updated last year
- A framework for Claude Opus to intelligently orchestrate subagents.☆4,328Jul 1, 2024Updated last year
- SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersec…☆18,796Updated this week
- Fine-tune LLM agents with online reinforcement learning☆1,249Mar 19, 2024Updated 2 years ago
- Official inference library for pre-processing of Mistral models☆868Mar 13, 2026Updated last week
- TACL 2025: Investigating Adversarial Trigger Transfer in Large Language Models☆19Aug 17, 2025Updated 7 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars☆989Jul 23, 2024Updated last year
- Verbosity control for AI agents☆66May 23, 2024Updated last year
- ☆446Apr 1, 2024Updated last year
- maze datasets for investigating OOD behavior of ML systems☆74Jan 19, 2026Updated 2 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.☆13,228Mar 6, 2026Updated 2 weeks ago
- LLM Analytics☆708Oct 19, 2024Updated last year
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach…☆5,937Oct 28, 2025Updated 4 months ago
- AI powered one-click comprehensive docs from transcripts and text.☆1,697Feb 11, 2025Updated last year
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆92Jul 21, 2024Updated last year
- Prompt engineering, automated.☆352Apr 22, 2025Updated 10 months ago