aiverify-foundation / aiverify
AI Verify
☆123 · Updated this week
Related projects
Alternatives and complementary repositories for aiverify
- Moonshot - A simple and modular tool to evaluate and red-team any LLM application. ☆180 · Updated this week
- 📚 A curated list of papers & technical articles on AI Quality & Safety ☆161 · Updated last year
- Fiddler Auditor is a tool to evaluate language models. ☆171 · Updated 8 months ago
- Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics) ☆18 · Updated this week
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆92 · Updated last year
- A trace analysis tool for AI agents. ☆124 · Updated last month
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆107 · Updated 8 months ago
- A Comprehensive Assessment of Trustworthiness in GPT Models ☆261 · Updated 2 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training". ☆84 · Updated 8 months ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆62 · Updated this week
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker (see the sketch after this list). ☆106 · Updated this week
- Inspect: A framework for large language model evaluations ☆624 · Updated this week
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆65 · Updated this week
- Python SDK for running evaluations on LLM generated responses ☆221 · Updated last week
- Rule-based Retrieval is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applications… ☆222 · Updated last month
- Red-Teaming Language Models with DSPy ☆142 · Updated 7 months ago
- This repository stems from our paper, “Cataloguing LLM Evaluations”, and serves as a living, collaborative catalogue of LLM evaluation frameworks… ☆14 · Updated last year
- A tool for evaluating LLMs ☆392 · Updated 6 months ago
- LLM security and privacy ☆41 · Updated last month
- A benchmark for prompt injection detection systems. ☆87 · Updated 2 months ago
- Dataset for the Tensor Trust project ☆33 · Updated 8 months ago
- Sample notebooks and prompts for LLM evaluation ☆114 · Updated last week
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024] ☆220 · Updated 2 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. ☆313 · Updated 8 months ago
- Lakera - ChatGPT Data Leak Protection ☆23 · Updated 4 months ago
- A text embedding viewer for the Jupyter environment ☆18 · Updated 9 months ago
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs ☆184 · Updated 5 months ago
- Make your GenAI apps safe & secure: test & harden your system prompt ☆404 · Updated last month
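The RAGElo entry above ranks RAG-based agents with an Elo ranker. As background, here is a minimal sketch of the standard Elo update rule that such pairwise rankers build on; it illustrates only the generic math, and the function name and K-factor default are hypothetical, not RAGElo's actual API.

```python
# Minimal sketch of the Elo update rule underlying pairwise rankers.
# Illustrative only; not RAGElo's implementation or API.

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return updated ratings for agents A and B after one comparison.

    score_a is 1.0 if A's answer was judged better, 0.0 if worse,
    and 0.5 for a tie (the judgment might come from an LLM judge).
    """
    # Expected score of A given the current rating gap.
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    # Move each rating toward the observed outcome, scaled by k.
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new

# Example: two RAG agents start at 1000; A wins one head-to-head judgment.
print(elo_update(1000.0, 1000.0, score_a=1.0))  # (1016.0, 984.0)
```

Pairwise Elo ranking suits LLM evaluation because a judge only has to compare two answers at a time, and the ratings aggregate many noisy head-to-head judgments into a stable leaderboard.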