haizelabs / llama3-jailbreak
A trivial programmatic Llama 3 jailbreak. Sorry Zuck!
☆567 · Jan 26, 2025 · Updated last year
Alternatives and similar repositories for llama3-jailbreak
Users who are interested in llama3-jailbreak are comparing it to the libraries listed below.
- Red-Teaming Language Models with DSPy ☆251 · Feb 13, 2025 · Updated last year
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,404 · Dec 10, 2024 · Updated last year
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆100 · Apr 13, 2025 · Updated 10 months ago
- (AAAI'25) Training-and-prompt Free General Painterly Image Harmonization Using image-wise attention sharing ☆60 · Dec 17, 2024 · Updated last year
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 · Aug 3, 2024 · Updated last year
- ☆866 · Jan 22, 2025 · Updated last year
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be… ☆3,053 · Apr 24, 2025 · Updated 9 months ago
- MAexp is a generic platform for RL-based multi-agent exploration ☆104 · Aug 25, 2025 · Updated 5 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,155 · Updated this week
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆116 · Jun 13, 2024 · Updated last year
- ☆16 · May 30, 2024 · Updated last year
- Large Action Model framework to develop AI Web Agents ☆6,295 · Jan 21, 2025 · Updated last year
- Reaching LLaMA2 Performance with 0.1M Dollars ☆986 · Jul 23, 2024 · Updated last year
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings. ☆92 · Jul 21, 2024 · Updated last year
- ☆27 · Oct 22, 2024 · Updated last year
- SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersec… ☆18,478 · Feb 10, 2026 · Updated last week
- Official inference library for pre-processing of Mistral models ☆854 · Updated this week
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,834 · Oct 28, 2025 · Updated 3 months ago
- Universal and Transferable Attacks on Aligned Language Models ☆4,493 · Aug 2, 2024 · Updated last year
- A framework for Claude Opus to intelligently orchestrate subagents. ☆4,317 · Jul 1, 2024 · Updated last year
- ☆444 · Apr 1, 2024 · Updated last year
- Fine-tune LLM agents with online reinforcement learning ☆1,246 · Mar 19, 2024 · Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,155 · Feb 8, 2026 · Updated last week
- [CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak… ☆3,555 · Dec 24, 2024 · Updated last year
- AI powered one-click comprehensive docs from transcripts and text. ☆1,694 · Feb 11, 2025 · Updated last year
- ☆15 · Apr 26, 2025 · Updated 9 months ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024 ☆11 · Jun 6, 2024 · Updated last year
- Verbosity control for AI agents ☆66 · May 23, 2024 · Updated last year
- PyTorch native post-training library ☆5,679 · Updated this week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆270 · Jan 10, 2026 · Updated last month
- ☆3,069 · Nov 21, 2025 · Updated 2 months ago
- WIP - Allows you to create DSPy pipelines using ComfyUI ☆204 · Dec 1, 2024 · Updated last year
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆94 · Nov 17, 2024 · Updated last year
- Python package for measuring memorization in LLMs. ☆184 · Jul 16, 2025 · Updated 7 months ago
- A fast + lightweight implementation of the GCG algorithm in PyTorch ☆317 · May 13, 2025 · Updated 9 months ago
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,863 · Jun 22, 2025 · Updated 7 months ago
- ☆274 · Aug 6, 2024 · Updated last year
- DSPy: The framework for programming—not prompting—language models ☆32,156 · Updated this week
- Go ahead and axolotl questions ☆11,289 · Updated this week