Weixin-Liang / ChatGPT-Detector-Bias
☆39 Updated last year
Alternatives and similar repositories for ChatGPT-Detector-Bias
Users who are interested in ChatGPT-Detector-Bias are comparing it to the repositories listed below.
- Official repository for our NeurIPS 2023 paper "Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense… ☆184 Updated 2 years ago
- Can AI-Generated Text be Reliably Detected? ☆86 Updated 2 years ago
- Transformer-based model for learning authorship representations. ☆46 Updated last year
- Official Repository for Dataset Inference for LLMs ☆43 Updated last year
- The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22) ☆26 Updated 3 years ago
- ☆116 Updated last year
- DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature ☆445 Updated 2 years ago
- Continuously updated list of related resources for generative LLMs like GPT and their analysis and detection. ☆229 Updated 7 months ago
- ☆57 Updated last year
- ☆152 Updated 3 years ago
- ☆226 Updated 4 years ago
- ☆59 Updated 2 years ago
- ☆13 Updated 3 years ago
- Repository for the Bias Benchmark for QA dataset. ☆133 Updated 2 years ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆120 Updated 10 months ago
- A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings) ☆26 Updated 4 years ago
- Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper ☆85 Updated 4 years ago
- Data for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" ☆20 Updated 2 years ago
- [ICML 2025] Weak-to-Strong Jailbreaking on Large Language Models ☆91 Updated 8 months ago
- Training data extraction on GPT-2 ☆194 Updated 2 years ago
- ☆44 Updated 2 years ago
- 👩💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 Updated last year
- Aligning AI With Shared Human Values (ICLR 2021) ☆308 Updated 2 years ago
- Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs. ☆50 Updated 2 years ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆65 Updated 2 years ago
- ☆39 Updated 2 years ago
- ☆48 Updated 11 months ago
- Official repository for "PostMark: A Robust Blackbox Watermark for Large Language Models" ☆27 Updated last year
- Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment ☆108 Updated last year
- ☆28 Updated last year