IBM / RADAR
Code for our NeurIPS 2023 paper, "RADAR: Robust AI-Text Detection via Adversarial Learning". We tested RADAR on 8 LLMs, including Vicuna and LLaMA. The results show that RADAR attains good detection performance on LLM-generated text while remaining robust against paraphrasing.
☆66Updated last month
Alternatives and similar repositories for RADAR
Users who are interested in RADAR are comparing it to the repositories listed below.
- SeqXGPT: An advanced method for sentence-level AI-generated text detection.☆94Updated 2 years ago
- A survey and reflection on the latest research breakthroughs in LLM-generated Text detection, including data, detectors, metrics, current…☆79Updated 11 months ago
- RAID is the largest and most challenging benchmark for AI-generated text detection. (ACL 2024)☆93Updated 3 weeks ago
- Can AI-Generated Text be Reliably Detected?☆86Updated last year
- Code/data for MARG (multi-agent review generation)☆57Updated last month
- A survey and reflection on the latest research breakthroughs in LLM-generated Text detection, including data, detectors, metrics, current…☆235Updated 10 months ago
- [AAAI 2024] The official repository for our paper, "OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially …☆50Updated this week
- (NAACL 2024) Official code repository for Mixset.☆27Updated 11 months ago
- ☆30Updated last year
- Official repository for our NeurIPS 2023 paper "Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense…☆179Updated 2 years ago
- ☆34Updated last year
- Ghostbuster: Detecting Text Ghostwritten by Large Language Models (NAACL 2024)☆164Updated last year
- The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?"☆77Updated last year
- Continuously updated list of related resources for generative LLMs like GPT and their analysis and detection.☆228Updated 5 months ago
- Benchmarking LLMs' Psychological Portrayal☆126Updated 10 months ago
- Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs.☆50Updated 2 years ago
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang…☆141Updated 2 months ago
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique☆18Updated last year
- Paper list for the survey "Combating Misinformation in the Age of LLMs: Opportunities and Challenges" and the initiative "LLMs Meet Misin…☆103Updated last year
- The latest papers on detection of LLM-generated text and code☆280Updated 4 months ago
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text☆31Updated 2 years ago
- [ACL24] EmoBench: Evaluating the Emotional Intelligence of Large Language Models☆97Updated 5 months ago
- LLMDet is a text detection tool that can identify which source a text came from (e.g. a large language model or a human writer).☆80Updated last year
- ☆109Updated 6 months ago
- [TACL] Code for "Red Teaming Language Model Detectors with Language Models"☆23Updated last year
- [ICML 2025] Weak-to-Strong Jailbreaking on Large Language Models☆88Updated 6 months ago
- ☆47Updated 7 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"☆109Updated last year
- [ACL 2025] Knowledge Unlearning for Large Language Models☆46Updated last month
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences"☆19Updated last year