aiverify-foundation / moonshot
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
☆180 · Updated this week
Related projects
Alternatives and complementary repositories for moonshot
- AI Verify ☆123 · Updated this week
- Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics) ☆18 · Updated this week
- Red-Teaming Language Models with DSPy ☆142 · Updated 7 months ago
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs ☆184 · Updated 5 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024] ☆220 · Updated 2 months ago
- A trace analysis tool for AI agents. ☆124 · Updated last month
- This repository provides an implementation to formalize and benchmark prompt injection attacks and defenses ☆146 · Updated 2 months ago
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal ☆342 · Updated 3 months ago
- A Comprehensive Assessment of Trustworthiness in GPT Models ☆261 · Updated 2 months ago
- Persuasive Jailbreaker: we can persuade LLMs to jailbreak them! ☆259 · Updated last month
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ☆313 · Updated 8 months ago
- A benchmark for prompt injection detection systems. ☆87 · Updated 2 months ago
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆471 · Updated 4 months ago
- Automatically evaluate your LLMs in Google Colab ☆559 · Updated 6 months ago
- A collection of automated evaluators for assessing jailbreak attempts. ☆75 · Updated 4 months ago
- Fiddler Auditor is a tool to evaluate language models. ☆171 · Updated 8 months ago
- LLM security and privacy ☆41 · Updated last month
- TAP: An automated jailbreaking method for black-box LLMs ☆119 · Updated 8 months ago
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆148 · Updated last month
- ⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs ☆315 · Updated 9 months ago
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆323 · Updated last month
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆107 · Updated 8 months ago
- Papers about red teaming LLMs and Multimodal models. ☆78 · Updated this week
- Official Repo of ACL 2024 Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs` ☆45 · Updated 3 weeks ago
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆104 · Updated last month
- Sample notebooks and prompts for LLM evaluation ☆114 · Updated last week
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks ☆28 · Updated 5 months ago