YichenZW / Robust-Det
Code implementation for the paper "Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks" (ACL 2024 main) by Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu, Chao Shen, Xiaoming Liu, Yulia Tsvetkov, and Tianxing He, primarily at the Paul G. Allen School of Computer Science & Engineering, University of Washington.
☆11 · Updated 9 months ago
Alternatives and similar repositories for Robust-Det:
Users interested in Robust-Det are comparing it to the repositories listed below.
- AbstainQA, ACL 2024 · ☆25 · Updated 6 months ago
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) · ☆58 · Updated last year
- ☆41 · Updated last year
- Code for the EMNLP 2024 paper "Neuron-Level Knowledge Attribution in Large Language Models" · ☆30 · Updated 5 months ago
- ☆19 · Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering · ☆53 · Updated 5 months ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model · ☆67 · Updated 2 years ago
- Code & data for the paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations" · ☆63 · Updated last year
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text · ☆29 · Updated last year
- ☆25 · Updated 7 months ago
- Recent papers on (1) psychology of LLMs and (2) biases in LLMs · ☆48 · Updated last year
- Code for the ACL 2023 paper "Fact-Checking Complex Claims with Program-Guided Reasoning" · ☆54 · Updated last year
- Public code repo for the COLING 2025 paper "Aligning LLMs with Individual Preferences via Interaction" · ☆26 · Updated 3 weeks ago
- Official code for the ICML 2024 paper on Persona In-Context Learning (PICLe) · ☆23 · Updated 9 months ago
- ☆41 · Updated 11 months ago
- [EMNLP 2024] Official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" · ☆19 · Updated 6 months ago
- [ICML 2023] Code for the paper "Compositional Exemplars for In-context Learning" · ☆99 · Updated 2 years ago
- Repo for the paper "Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge" · ☆13 · Updated last year
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod…" · ☆36 · Updated last year
- ☆25 · Updated last year
- ☆54 · Updated last month
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) · ☆57 · Updated last year
- Dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?" · ☆63 · Updated 5 months ago
- Code for the 2024 arXiv paper "Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Mo…" · ☆24 · Updated 9 months ago
- Code associated with "Tuning Language Models by Proxy" (Liu et al., 2024) · ☆109 · Updated last year
- Restoring safety in fine-tuned language models through task arithmetic · ☆28 · Updated last year
- ☆49 · Updated last year
- EMNLP 2024: "Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue" · ☆35 · Updated 5 months ago
- ☆42 · Updated 5 months ago
- Source code for the MIND paper (ACL 2024, long paper) · ☆39 · Updated 10 months ago