leondz / lm_risk_cards

Risks and targets for assessing LLMs & LLM vulnerabilities

☆32

Alternatives and similar repositories for lm_risk_cards

Users who are interested in lm_risk_cards are comparing it to the libraries listed below.

lve-org / lve
A repository of Language Model Vulnerabilities and Exposures (LVEs).
☆112 Updated last year
Valhall-ai / prompt-injection-mitigations
A collection of prompt injection mitigation techniques.
☆24 Updated 2 years ago
lakeraai / pint-benchmark
A benchmark for prompt injection detection systems.
☆148 Updated 2 months ago
vinusankars / BEAST
Implementation of BEAST adversarial attack for language models (ICML 2024)
☆91 Updated last year
microsoft / BIPIA
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
☆88 Updated last year
briland / LLM-security-and-privacy
LLM security and privacy
☆51 Updated last year
google-research / camel-prompt-injection
Code for the paper "Defeating Prompt Injections by Design"
☆150 Updated 5 months ago
microsoft / TaskTracker
TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…
☆74 Updated 2 months ago
mitre-atlas / atlas-data
ATLAS tactics, techniques, and case studies data
☆85 Updated last week
ethz-spylab / agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆348 Updated 3 weeks ago
andyzorigin / cybench
☆168 Updated 5 months ago
deadbits / vigil-llm
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
☆426 Updated last year
dropbox / llm-security
Dropbox LLM Security research code and results
☆243 Updated last year
pasquini-dario / LLMmap
☆81 Updated 3 months ago
agencyenterprise / PromptInject
PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…
☆433 Updated last year
liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt injection attacks and defenses
☆333 Updated 3 weeks ago
IBM / URET
Universal Robustness Evaluation Toolkit (for Evasion)
☆31 Updated 2 months ago
LostOxygen / llm-confidentiality
Whispers in the Machine: Confidentiality in Agentic Systems
☆41 Updated last week
precize / Agentic-AI-Top10-Vulnerability
Top 10 for Agentic AI (AI Agent Security) serves as the core for OWASP and CSA Red teaming work
☆151 Updated last month
mnns / LLMFuzzer
🧠 LLMFuzzer - Fuzzing Framework for Large Language Models 🧠 LLMFuzzer is the first open-source fuzzing framework specifically designed …
☆327 Updated last year
parameterlab / trap
Source code of "TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification", ACL2024 (findings)
☆13 Updated last year
kenhuangus / Top-Threats-for-AI-Agents
☆55 Updated 6 months ago
protectai / nbdefense
Secure Jupyter Notebooks and Experimentation Environment
☆84 Updated 9 months ago
Libr-AI / OpenRedTeaming
Papers about red teaming LLMs and Multimodal models.
☆154 Updated 5 months ago
Reapor-Yurnero / imprompter
Codebase of https://arxiv.org/abs/2410.14923
☆52 Updated last year
uiuc-kang-lab / InjecAgent
☆92 Updated last year
sunblaze-ucb / cybergym
CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on…
☆90 Updated last month
mitre-atlas / ai-risk-database
Explore AI Supply Chain Risk with the AI Risk Database
☆63 Updated last year
chawins / pal
PAL: Proxy-Guided Black-Box Attack on Large Language Models
☆55 Updated last year
leondz / autoredteam
autoredteam: code for training models that automatically red team other language models
☆13 Updated 2 years ago