aiverify-foundation / moonshot-data
Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics)
☆36 Updated 2 weeks ago
Alternatives and similar repositories for moonshot-data
Users who are interested in moonshot-data are comparing it to the libraries listed below
- ☆34 Updated 10 months ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆104 Updated this week
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆115 Updated this week
- ☆43 Updated last year
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆45 Updated last year
- A simple evaluation of generative language models and safety classifiers. ☆64 Updated this week
- Moonshot - A simple and modular tool to evaluate and red-team any LLM application. ☆270 Updated 2 weeks ago
- Code for the paper "Fishing for Magikarp" ☆165 Updated 4 months ago
- AI Verify ☆32 Updated this week
- ☆45 Updated last year
- Collection of evals for Inspect AI ☆233 Updated this week
- ☆140 Updated 3 years ago
- Evaluating LLMs with fewer examples ☆161 Updated last year
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness ☆102 Updated 7 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆165 Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting". ☆111 Updated 3 months ago
- Red-Teaming Language Models with DSPy ☆213 Updated 7 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆70 Updated last month
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆18 Updated last year
- Official Repo for CRMArena and CRMArena-Pro ☆116 Updated 2 months ago
- A Comprehensive Assessment of Trustworthiness in GPT Models ☆303 Updated last year
- ☆81 Updated 2 weeks ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆55 Updated 6 months ago
- Papers about red teaming LLMs and Multimodal models. ☆139 Updated 3 months ago
- Open Implementations of LLM Analyses ☆107 Updated 11 months ago
- ☆57 Updated last month
- ☆249 Updated 5 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 Updated last year
- ☆39 Updated 2 years ago
- Improving Alignment and Robustness with Circuit Breakers ☆233 Updated 11 months ago