aiverify-foundation / moonshot-data
Contains all assets needed to run with the Moonshot Library (Connectors, Datasets and Metrics)
☆36 · Updated last week
Alternatives and similar repositories for moonshot-data
Users interested in moonshot-data are comparing it to the libraries listed below:
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆42 · Updated 9 months ago
- NeurIPS'24 - LLM Safety Landscape ☆25 · Updated 4 months ago
- Code for the paper "Fishing for Magikarp" ☆157 · Updated last month
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆95 · Updated this week
- ☆34 · Updated 8 months ago
- [ICLR 2025] 🚀 CodeMMLU Evaluator: A framework for evaluating language models on the CodeMMLU MCQ benchmark. ☆23 · Updated 2 months ago
- An open-source compliance-centered evaluation framework for Generative AI models ☆157 · Updated last week
- A simple evaluation of generative language models and safety classifiers. ☆56 · Updated 11 months ago
- Benchmarking Large Language Models ☆99 · Updated 3 weeks ago
- ☆45 · Updated 3 months ago
- Code accompanying "How I learned to start worrying about prompt formatting". ☆106 · Updated last month
- Evaluating LLMs with fewer examples ☆159 · Updated last year
- ☆36 · Updated 2 years ago
- Documenting large text datasets 🖼️ 📚 ☆12 · Updated 6 months ago
- FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists ☆28 · Updated 4 months ago
- Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing" ☆25 · Updated 3 months ago
- Learning to route instances for Human vs AI Feedback (ACL 2025 Main) ☆23 · Updated last month
- ☆29 · Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 · Updated last year
- Moonshot - A simple and modular tool to evaluate and red-team any LLM application. ☆253 · Updated last week
- Collection of evals for Inspect AI ☆173 · Updated this week
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆204 · Updated this week
- LLM Attributor: Attribute LLM's Generated Text to Training Data ☆52 · Updated last year
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆84 · Updated 7 months ago
- Red-Teaming Language Models with DSPy ☆202 · Updated 4 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆54 · Updated 4 months ago
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness ☆101 · Updated 5 months ago
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023] ☆43 · Updated last year
- A package dedicated to running benchmark agreement testing ☆16 · Updated 2 months ago
- Official Repository for Dataset Inference for LLMs ☆35 · Updated 11 months ago