M0gician / RaccoonBench
[ACL 2024] Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications
☆14 Updated last year
Alternatives and similar repositories for RaccoonBench
Users who are interested in RaccoonBench are comparing it to the libraries listed below.
- ☆106 Updated 4 months ago
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs). ☆155 Updated last year
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆81 Updated last year
- Mostly recording papers about models' trustworthy applications. Intending to include topics like model evaluation & analysis, security, c… ☆21 Updated 2 years ago
- A lightweight library for large language model (LLM) jailbreaking defense. ☆55 Updated 10 months ago
- Code for the paper "Defending against LLM Jailbreaking via Backtranslation" ☆30 Updated last year
- Awesome LLM Jailbreak academic papers ☆105 Updated last year
- ☆25 Updated 10 months ago
- ☆52 Updated 3 weeks ago
- A toolkit to assess data privacy in LLMs (under development) ☆62 Updated 8 months ago
- ☆182 Updated last year
- ☆51 Updated last year
- Code for ACL 2024 paper: PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models. ☆13 Updated 7 months ago
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆48 Updated last month
- Official implementation of AdvPrompter https://arxiv.org/abs/2404.16873 ☆163 Updated last year
- Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as … ☆70 Updated this week
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models ☆183 Updated 6 months ago
- LLM Unlearning ☆174 Updated last year
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆95 Updated 3 months ago
- Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversaria… ☆54 Updated 2 years ago
- ☆13 Updated 3 months ago
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆83 Updated 5 months ago
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models" aims to protect the IP of open-source… ☆62 Updated 7 months ago
- Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024) ☆147 Updated 9 months ago
- ☆41 Updated 5 months ago
- Fast Memorization of Prompt Improves Context Awareness of Large Language Models (Findings of EMNLP 2024) ☆22 Updated 10 months ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆158 Updated 5 months ago
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆87 Updated 3 months ago
- [COLING 2025] Official repo of paper: "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jail… ☆13 Updated last year
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models ☆37 Updated last year