leondz / lm_risk_cards
Risks and targets for assessing LLMs & LLM vulnerabilities
☆30 · Updated 10 months ago
Alternatives and similar repositories for lm_risk_cards:
Users interested in lm_risk_cards are comparing it to the libraries listed below.
- A collection of prompt injection mitigation techniques. ☆22 · Updated last year
- LLM security and privacy ☆48 · Updated 6 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆109 · Updated last year
- Whispers in the Machine: Confidentiality in LLM-integrated Systems ☆35 · Updated last month
- A benchmark for prompt injection detection systems. ☆100 · Updated 2 months ago
- ATLAS tactics, techniques, and case studies data ☆63 · Updated last month
- Explore AI Supply Chain Risk with the AI Risk Database ☆53 · Updated 11 months ago
- ☆93 · Updated last month
- Universal Robustness Evaluation Toolkit (for Evasion) ☆32 · Updated last year
- Implementation of BEAST adversarial attack for language models (ICML 2024) ☆82 · Updated 11 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆130 · Updated 3 weeks ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆50 · Updated 8 months ago
- This repository provides a benchmark for prompt injection attacks and defenses ☆188 · Updated last week
- ☆31 · Updated 5 months ago
- Secure Jupyter Notebooks and Experimentation Environment ☆74 · Updated 2 months ago
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". ☆45 · Updated 6 months ago
- Papers about red teaming LLMs and Multimodal models. ☆111 · Updated 5 months ago
- Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact and break out of shell environments using the Over… ☆12 · Updated last year
- Project LLM Verification Standard ☆43 · Updated last year
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. ☆66 · Updated last year
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆70 · Updated 2 months ago
- ☆127 · Updated 5 months ago
- Red-Teaming Language Models with DSPy ☆183 · Updated 2 months ago
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023] ☆43 · Updated 11 months ago
- ☆59 · Updated 9 months ago
- Payloads for Attacking Large Language Models ☆79 · Updated 9 months ago
- Code to break Llama Guard ☆31 · Updated last year
- Dropbox LLM Security research code and results ☆222 · Updated 11 months ago
- ☆42 · Updated 8 months ago
- A prompt injection game to collect data for robust ML research ☆55 · Updated 2 months ago