chziakas / redeval
A library for red-teaming LLM applications with LLMs.
☆25 · Updated 5 months ago
Alternatives and similar repositories for redeval:
Users who are interested in redeval are comparing it to the libraries listed below:
- Red-Teaming Language Models with DSPy ☆175 · Updated last month
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆41 · Updated 6 months ago
- Sphynx Hallucination Induction ☆53 · Updated 2 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆89 · Updated 9 months ago
- LLM Evals for Text Summarization and RAG use cases. ☆35 · Updated last year
- Whispers in the Machine: Confidentiality in LLM-integrated Systems ☆35 · Updated 3 weeks ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆108 · Updated last year
- Test LLMs against jailbreaks and unprecedented harms ☆26 · Updated 5 months ago
- A prompt injection game to collect data for robust ML research ☆54 · Updated 2 months ago
- Code to break Llama Guard ☆31 · Updated last year
- This project investigates the security of large language models by performing binary classification of a set of input prompts to discover… ☆38 · Updated last year
- Track the progress of LLM context utilisation ☆54 · Updated 8 months ago
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆14 · Updated 7 months ago
- Papers about red teaming LLMs and Multimodal models. ☆105 · Updated 4 months ago
- Dataset for the Tensor Trust project ☆39 · Updated last year
- Improving Alignment and Robustness with Circuit Breakers ☆192 · Updated 6 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated 8 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 · Updated 3 weeks ago
- Using open-source LLMs to build synthetic datasets for direct preference optimization ☆59 · Updated last year
- Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing" ☆16 · Updated this week
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆98 · Updated last year