aiverify-foundation / moonshot-data
Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics)
☆38 · Updated 2 months ago
Alternatives and similar repositories for moonshot-data
Users interested in moonshot-data are comparing it to the libraries listed below.
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆49 · Updated last year
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆111 · Updated this week
- ☆43 · Updated last year
- ☆35 · Updated last year
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆121 · Updated last month
- Red-Teaming Language Models with DSPy ☆238 · Updated 9 months ago
- Moonshot: a simple and modular tool to evaluate and red-team any LLM application. ☆285 · Updated 2 months ago
- [ICLR 2025] 🚀 CodeMMLU Evaluator: a framework for evaluating language models on the CodeMMLU MCQ benchmark. ☆28 · Updated 7 months ago
- Code for the paper "Fishing for Magikarp" ☆175 · Updated 6 months ago
- Collection of evals for Inspect AI ☆290 · Updated this week
- Evaluating LLMs with fewer examples ☆168 · Updated last year
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs ☆296 · Updated last year
- ☆38 · Updated 2 years ago
- A simple evaluation of generative language models and safety classifiers. ☆79 · Updated last month
- A method for steering LLMs to better follow instructions ☆62 · Updated 3 months ago
- Open Implementations of LLM Analyses ☆107 · Updated last year
- An open-source, compliance-centered evaluation framework for generative AI models ☆172 · Updated 2 weeks ago
- Papers about red-teaming LLMs and multimodal models. ☆156 · Updated 6 months ago
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆18 · Updated last year
- Official repo for CRMArena and CRMArena-Pro ☆126 · Updated 3 weeks ago
- ☆87 · Updated this week
- ☆49 · Updated last year
- Codebase release for an EMNLP 2023 paper ☆19 · Updated 2 months ago
- WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m… ☆155 · Updated 6 months ago
- Aioli: a unified optimization framework for language model data mixing ☆31 · Updated 10 months ago
- Code accompanying "How I learned to start worrying about prompt formatting". ☆112 · Updated 5 months ago
- AI Verify ☆37 · Updated 3 weeks ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆75 · Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated last year
- A benchmark for evaluating the robustness of LLMs and defenses against indirect prompt injection attacks. ☆89 · Updated last year