haizelabs / llama3-jailbreak
A trivial programmatic Llama 3 jailbreak. Sorry Zuck!
☆549 Updated 3 months ago
Alternatives and similar repositories for llama3-jailbreak
Users who are interested in llama3-jailbreak are comparing it to the libraries listed below.
- A benchmark for emotional intelligence in large language models ☆289 Updated 9 months ago
- Persuasive Jailbreaker: we can persuade LLMs to jailbreak them! ☆299 Updated 7 months ago
- Red-Teaming Language Models with DSPy ☆192 Updated 3 months ago
- ☆401 Updated 9 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆223 Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆235 Updated 3 months ago
- ☆444 Updated last year
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆439 Updated 7 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆173 Updated last year
- A library for making RepE control vectors ☆587 Updated 4 months ago
- Simple Python library/structure to ablate features in LLMs which are supported by TransformerLens ☆462 Updated 11 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆91 Updated last month
- ☆288 Updated last month
- Code release for Best-of-N Jailbreaking ☆495 Updated 3 months ago
- A multimodal, function calling powered LLM webui. ☆214 Updated 7 months ago
- ☆437 Updated 7 months ago
- llama.cpp with the BakLLaVA model describes what it sees ☆383 Updated last year
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆653 Updated 11 months ago
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆151 Updated last week
- An mlx project to train a base model on your whatsapp chats using (Q)Lora finetuning ☆167 Updated last year
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆304 Updated 3 months ago
- Approximation of the Claude 3 tokenizer by inspecting generation stream ☆129 Updated 9 months ago
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language ☆305 Updated 11 months ago
- Automatically evaluate your LLMs in Google Colab ☆622 Updated last year
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction". ☆217 Updated 7 months ago
- From anywhere you can type, query and stream the output of an LLM or any other script ☆496 Updated last year
- Efficient visual programming for AI language models ☆361 Updated 8 months ago
- NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRav… ☆315 Updated last year
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆985 Updated 3 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆238 Updated 11 months ago