aiverify-foundation / moonshotLinks

Moonshot - A simple and modular tool to evaluate and red-team any LLM application.

☆285

Alternatives and similar repositories for moonshot

Users that are interested in moonshot are comparing it to the libraries listed below

Sorting:

aiverify-foundation / aiverify
AI Verify
☆37Updated 3 weeks ago
haizelabs / redteaming-resistance-benchmark
☆49Updated last year
agencyenterprise / PromptInject
PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…
☆437Updated last year
haizelabs / dspy-redteam
Red-Teaming Language Models with DSPy
☆238Updated 9 months ago
ethz-spylab / agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆357Updated last month
Libr-AI / do-not-answer
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
☆296Updated last year
UKGovernmentBEIS / inspect_evals
Collection of evals for Inspect AI
☆289Updated last week
Giskard-AI / awesome-ai-safety
📚 A curated list of papers & technical articles on AI Quality & Safety
☆193Updated 7 months ago
invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆366Updated 4 months ago
ibm-granite / granite-guardian
The Granite Guardian models are designed to detect risks in prompts and responses.
☆121Updated last month
PAIR-code / llm-comparator
LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t…
☆499Updated 9 months ago
aiverify-foundation / moonshot-data
Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics)
☆38Updated 2 months ago
fiddler-labs / fiddler-auditor
Fiddler Auditor is a tool to evaluate language models.
☆188Updated last year
potsawee / selfcheckgpt
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
☆578Updated last year
AI-secure / DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
☆308Updated last year
CHATS-lab / persuasive_jailbreaker
Persuasive Jailbreaker: we can persuade LLMs to jailbreak them!
☆334Updated last month
compl-ai / compl-ai
An open-source compliance-centered evaluation framework for Generative AI models
☆172Updated 2 weeks ago
Libr-AI / OpenRedTeaming
Papers about red teaming LLMs and Multimodal models.
☆156Updated 6 months ago
liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt injection attacks and defenses
☆346Updated last month
MadryLab / context-cite
Attribute (or cite) statements generated by LLMs back to in-context information.
☆300Updated last year
centerforaisafety / HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
☆789Updated last year
mlcommons / modelbench
Run safety benchmarks against AI models and view detailed reports showing how well they performed.
☆111Updated last week
ZenGuard-AI / fast-llm-security-guardrails
The fastest Trust Layer for AI Agents
☆145Updated 6 months ago
HowieHwong / TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
☆613Updated 5 months ago
aiverify-foundation / LLM-Evals-Catalogue
This repository stems from our paper, “Cataloguing LLM Evaluations”, and serves as a living, collaborative catalogue of LLM evaluation fr…
☆18Updated 2 years ago
mlabonne / llm-autoeval
Automatically evaluate your LLMs in Google Colab
☆671Updated last year
ServiceNow / TapeAgents
TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle
☆298Updated 3 weeks ago
arthur-ai / bench
A tool for evaluating LLMs
☆428Updated last year
tml-epfl / llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
☆364Updated 10 months ago
argilla-io / synthetic-data-generator
Build datasets using natural language
☆547Updated 2 months ago