chziakas / redeval
A library for red-teaming LLM applications with LLMs.
☆26 Updated 7 months ago
Alternatives and similar repositories for redeval
Users who are interested in redeval are comparing it to the libraries listed below.
- Red-Teaming Language Models with DSPy ☆192 Updated 3 months ago
- ☆100 Updated 2 months ago
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆41 Updated 7 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆91 Updated last month
- Sphynx Hallucination Induction ☆54 Updated 3 months ago
- ☆65 Updated 3 months ago
- Code to break Llama Guard ☆31 Updated last year
- Test LLMs against jailbreaks and unprecedented harms ☆29 Updated 6 months ago
- This project investigates the security of large language models by performing binary classification of a set of input prompts to discover… ☆39 Updated last year
- The LLM Red Teaming Framework ☆73 Updated this week
- Realign is a testing and simulation framework for AI applications. ☆16 Updated 5 months ago
- A prompt injection game to collect data for robust ML research ☆56 Updated 3 months ago
- Code for the paper "Fishing for Magikarp" ☆155 Updated this week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 10 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆102 Updated last year
- Dataset for the Tensor Trust project ☆40 Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆60 Updated 6 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆109 Updated last year
- ☆32 Updated 6 months ago
- ☆43 Updated 9 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆61 Updated last year
- Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing" ☆20 Updated last month
- ☆57 Updated this week
- ☆77 Updated 6 months ago
- ☆22 Updated 6 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆23 Updated last month
- ☆30 Updated 2 months ago
- LLM security and privacy ☆49 Updated 7 months ago
- Functional Benchmarks and the Reasoning Gap ☆86 Updated 7 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆29 Updated last month