aiverify-foundation / moonshot-data
Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics)
☆37Updated 2 months ago
Alternatives and similar repositories for moonshot-data
Users that are interested in moonshot-data are comparing it to the libraries listed below
- ☆35Updated 11 months ago
- ☆43Updated last year
- Run safety benchmarks against AI models and view detailed reports showing how well they performed.☆108Updated this week
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming"☆49Updated last year
- Code for the paper "Fishing for Magikarp"☆173Updated 5 months ago
- The Granite Guardian models are designed to detect risks in prompts and responses.☆119Updated last month
- Open Implementations of LLM Analyses☆107Updated last year
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique☆18Updated last year
- Documenting large text datasets 🖼️ 📚☆14Updated 10 months ago
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness☆101Updated 9 months ago
- codebase release for EMNLP2023 paper publication☆19Updated last month
- Evaluating LLMs with fewer examples☆165Updated last year
- ☆38Updated 2 years ago
- Red-Teaming Language Models with DSPy☆235Updated 8 months ago
- Collection of evals for Inspect AI☆280Updated this week
- FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists☆30Updated 2 months ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs☆92Updated 11 months ago
- A simple evaluation of generative language models and safety classifiers.☆72Updated 2 weeks ago
- Moonshot - A simple and modular tool to evaluate and red-team any LLM application.☆281Updated 2 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.☆57Updated 7 months ago
- Reward Model framework for LLM RLHF☆61Updated 2 years ago
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆211Updated last week
- Improving Alignment and Robustness with Circuit Breakers☆240Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting".☆110Updated 5 months ago
- Aioli: A unified optimization framework for language model data mixing☆28Updated 9 months ago
- ☆47Updated 7 months ago
- ☆53Updated last year
- ☆80Updated this week
- autoredteam: code for training models that automatically red team other language models☆13Updated 2 years ago
- WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m…☆152Updated 5 months ago