ColinLu50 / Evade-GPT-Detector
Source code for paper **Large Language Models can be Guided to Evade AI-Generated Text Detection**
☆35Updated last year
Alternatives and similar repositories for Evade-GPT-Detector:
Users who are interested in Evade-GPT-Detector are comparing it to the repositories listed below:
- SeqXGPT: An advanced method for sentence-level AI-generated text detection.☆87Updated last year
- LLMDet is a text detection tool that can identify which source generated a text (e.g., a large language model or a human writer).☆69Updated 10 months ago
- [TACL] Code for "Red Teaming Language Model Detectors with Language Models"☆19Updated last year
- Official repository for our NeurIPS 2023 paper "Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense…☆166Updated last year
- (NAACL 2024) Official code repository for Mixset.☆24Updated 4 months ago
- COLING'24 Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack☆46Updated last year
- Can AI-Generated Text be Reliably Detected?☆76Updated last year
- Code for our NeurIPS2023 accepted paper: RADAR: Robust AI-Text Detection via Adversarial Learning. We tested RADAR on 8 LLMs including Vi…☆51Updated last year
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text☆29Updated last year
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)☆74Updated 2 weeks ago
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid…☆23Updated last year
- Code for the paper: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection☆37Updated last year
- A survey and reflection on the latest research breakthroughs in LLM-generated Text detection, including data, detectors, metrics, current…☆72Updated 5 months ago
- LLM Unlearning☆153Updated last year
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM☆29Updated 3 months ago
- Continuously updated list of related resources for generative LLMs like GPT and their analysis and detection.☆219Updated last week
- Official code for DNA-GPT (ICLR 2024)☆50Updated last year
- Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models☆28Updated last year
- ☆128Updated 7 months ago
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context"☆69Updated 8 months ago
- Weak-to-Strong Jailbreaking on Large Language Models☆73Updated last year
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]☆70Updated 6 months ago
- Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversaria…☆50Updated 2 years ago
- ☆38Updated last month
- Hide and Seek (HaS): A Framework for Prompt Privacy Protection☆39Updated last year
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models"☆90Updated 7 months ago
- A lightweight library for large language model (LLM) jailbreaking defense.☆51Updated 6 months ago
- Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" in Findings of NAACL 2022☆29Updated 2 years ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model☆67Updated 2 years ago
- Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"☆52Updated last year