agiresearch / EmojiCrypt
EmojiCrypt: Prompt Encryption for Secure Communication with Large Language Models
☆13 · Updated last year
Alternatives and similar repositories for EmojiCrypt:
Users interested in EmojiCrypt are comparing it to the repositories listed below.
- Hide and Seek (HaS): A Framework for Prompt Privacy Protection ☆36 · Updated last year
- LLM Unlearning ☆144 · Updated last year
- ☆50 · Updated last month
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding ☆120 · Updated 7 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆130 · Updated 3 weeks ago
- ☆18 · Updated 11 months ago
- ☆42 · Updated 9 months ago
- [ICLR 2024] Paper showing properties of safety tuning and exaggerated safety ☆77 · Updated 10 months ago
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state ☆58 · Updated last month
- [NAACL 2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey ☆89 · Updated 7 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆89 · Updated 9 months ago
- [ACL 2024] SALAD benchmark & MD-Judge ☆132 · Updated this week
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆106 · Updated 5 months ago
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆66 · Updated last month
- ☆117 · Updated 6 months ago
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc. ☆206 · Updated 4 months ago
- ☆23 · Updated 10 months ago
- Code & data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024] ☆62 · Updated 5 months ago
- ☆52 · Updated 3 weeks ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆218 · Updated last year
- Source code of our paper MIND, ACL 2024 Long Paper ☆37 · Updated 9 months ago
- The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?" ☆55 · Updated 4 months ago
- Weak-to-Strong Jailbreaking on Large Language Models ☆72 · Updated last year
- A lightweight library for large language model (LLM) jailbreaking defense ☆47 · Updated 4 months ago
- Paper list for the survey "Combating Misinformation in the Age of LLMs: Opportunities and Challenges" and the initiative "LLMs Meet Misin… ☆97 · Updated 4 months ago
- The code for the paper "The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)", exploring the privacy risk o… ☆42 · Updated last month
- [NeurIPS 2023] This is the code for the paper "Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias" ☆151 · Updated last year
- A toolkit to assess data privacy in LLMs (under development) ☆54 · Updated 2 months ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models" ☆129 · Updated last year
- Benchmarking LLMs' Psychological Portrayal ☆110 · Updated 2 months ago