aiverify-foundation / moonshot-data
Contains all assets needed to run with the Moonshot Library (Connectors, Datasets, and Metrics).
☆15, updated this week
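For orientation, the sketch below shows how one might read a dataset asset from this repository in plain Python. The file path and the field names (`examples`, `input`, `target`) are assumptions for illustration only, not the documented Moonshot schema; check the actual JSON files in the repository before relying on them.

```python
import json
from pathlib import Path

# Hypothetical path: moonshot-data ships datasets as JSON files.
# The layout and schema used here are assumptions for illustration.
DATASET_PATH = Path("moonshot-data/datasets/example-dataset.json")

def load_prompts(path: Path) -> list[dict]:
    """Return prompt/target pairs from a Moonshot-style dataset file.

    The field names ("examples", "input", "target") are assumed,
    not a documented contract.
    """
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    return [
        {"prompt": ex.get("input", ""), "expected": ex.get("target", "")}
        for ex in data.get("examples", [])
    ]

if __name__ == "__main__":
    if DATASET_PATH.exists():
        # Print the first few prompt/target pairs as a sanity check.
        for pair in load_prompts(DATASET_PATH)[:3]:
            print(pair["prompt"], "->", pair["expected"])
```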
Related projects:
- Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" (☆55, updated 8 months ago)
- Code and datasets for the paper "Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment" (☆76, updated 6 months ago)
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. (☆50, updated this week)
- [ACL 2024] SALAD benchmark & MD-Judge (☆81, updated this week)
- Moonshot: A simple and modular tool to evaluate and red-team any LLM application. (☆144, updated this week)
- Python package for measuring memorization in LLMs. (☆107, updated this week)
- Papers about red-teaming LLMs and multimodal models. (☆66, updated this week)
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (☆275, updated last month)
- Dataset for the Tensor Trust project (☆29, updated 6 months ago)
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs (☆29, updated 2 months ago)
- Weak-to-Strong Jailbreaking on Large Language Models (☆62, updated 7 months ago)
- An Open Robustness Benchmark for Jailbreaking Language Models [arXiv 2024] (☆169, updated last month)
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024] (☆181, updated last month)
- AI Verify (☆111, updated this week)
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023] (☆41, updated 4 months ago)
- Official repository for the ACL 2024 paper "SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding" (☆89, updated 2 months ago)
- Improving Alignment and Robustness with Circuit Breakers (☆124, updated 2 months ago)
- [ICLR 2024] Data for "Multilingual Jailbreak Challenges in Large Language Models" (☆50, updated 6 months ago)
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" (☆59, updated 10 months ago)
- A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use (☆106, updated 6 months ago)
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs (☆156, updated 3 months ago)
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873) (☆110, updated 4 months ago)
- Code for the paper "Defending against LLM Jailbreaking via Backtranslation" (☆20, updated last month)
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction" (☆76, updated 3 weeks ago)
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch (☆10, updated last year)
- Official code for the paper "Evaluating Copyright Takedown Methods for Language Models" (☆14, updated 2 months ago)
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLMs (☆33, updated 3 weeks ago)