M0gician / RaccoonBench
[ACL 2024] Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications
☆14 Updated last year
Alternatives and similar repositories for RaccoonBench
Users who are interested in RaccoonBench are comparing it to the libraries listed below.
- ☆106 Updated 4 months ago
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs). ☆155 Updated last year
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆81 Updated last year
- Mostly recording papers about models' trustworthy applications. Intending to include topics like model evaluation & analysis, security, c… ☆21 Updated 2 years ago
- A lightweight library for large language model (LLM) jailbreaking defense. ☆55 Updated 10 months ago
- Code for the paper "Defending against LLM Jailbreaking via Backtranslation" ☆30 Updated last year
- Awesome LLM Jailbreak academic papers ☆105 Updated last year
- ☆25 Updated 10 months ago
- ☆52 Updated 3 weeks ago
- A toolkit to assess data privacy in LLMs (under development) ☆62 Updated 8 months ago
- ☆182 Updated last year
- ☆51 Updated last year
- Code for ACL 2024 paper: PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models. ☆13 Updated 7 months ago
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆48 Updated last month
- Official implementation of AdvPrompter https://arxiv.org/abs/2404.16873 ☆163 Updated last year
- Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as … ☆70 Updated this week
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models ☆183 Updated 6 months ago
- LLM Unlearning ☆174 Updated last year
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆95 Updated 3 months ago
- Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversaria… ☆54 Updated 2 years ago
- ☆13 Updated 3 months ago
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆83 Updated 5 months ago
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models" aims to protect the IP of open-source… ☆62 Updated 7 months ago
- Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024) ☆147 Updated 9 months ago
- ☆41 Updated 5 months ago
- Fast Memorization of Prompt Improves Context Awareness of Large Language Models (Findings of EMNLP 2024) ☆22 Updated 10 months ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆158 Updated 5 months ago
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆87 Updated 3 months ago
- [COLING 2025] Official repo of paper: "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jail… ☆13 Updated last year
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models ☆37 Updated last year