chziakas / redeval
A library for red-teaming LLM applications with LLMs.
☆21 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for redeval
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆32 · Updated last month
- Red-Teaming Language Models with DSPy ☆142 · Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 · Updated 4 months ago
- LLM Evals for Text Summarization and RAG use-cases. ☆35 · Updated 9 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆84 · Updated 8 months ago
- The official repository of the paper "On the Exploitability of Instruction Tuning". ☆57 · Updated 9 months ago
- Functional Benchmarks and the Reasoning Gap ☆78 · Updated last month
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆52 · Updated last month
- Open Implementations of LLM Analyses ☆94 · Updated last month
- Mixing Language Models with Self-Verification and Meta-Verification ☆97 · Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 · Updated 8 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆71 · Updated last month
- Dataset for the Tensor Trust project ☆32 · Updated 7 months ago
- Sphynx Hallucination Induction ☆47 · Updated 3 months ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024) ☆82 · Updated 2 weeks ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆34 · Updated 3 weeks ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆55 · Updated 3 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; arXiv preprint arXiv:2403.… ☆36 · Updated 3 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆38 · Updated 2 weeks ago
- Improving Alignment and Robustness with Circuit Breakers ☆152 · Updated last month