shizhouxing / LLM-Detector-Robustness
[TACL] Code for "Red Teaming Language Model Detectors with Language Models"
☆16 Updated 11 months ago
Related projects
Alternatives and complementary repositories for LLM-Detector-Robustness
- ☆38 Updated last year
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid… ☆23 Updated last year
- Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" on Findings of NAACL 2022 ☆27 Updated 2 years ago
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ☆36 Updated 5 months ago
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text ☆25 Updated last year
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024 ☆62 Updated last month
- [ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models ☆37 Updated 2 months ago
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆71 Updated 2 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆47 Updated last month
- Official code for the paper: Evaluating Copyright Takedown Methods for Language Models ☆15 Updated 4 months ago
- ☆39 Updated last month
- Restore safety in fine-tuned language models through task arithmetic ☆26 Updated 7 months ago
- ☆31 Updated last year
- Multilingual safety benchmark for Large Language Models ☆24 Updated 2 months ago
- ☆33 Updated last year
- Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models ☆23 Updated last year
- Paper list for the survey "Combating Misinformation in the Age of LLMs: Opportunities and Challenges" and the initiative "LLMs Meet Misin… ☆85 Updated last week
- Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs ☆31 Updated 9 months ago
- ☆16 Updated 4 months ago
- Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations" ☆60 Updated 8 months ago
- ☆39 Updated last year
- Official code implementation of SKU, Accepted by ACL 2024 Findings ☆11 Updated 6 months ago
- ☆26 Updated 6 months ago
- We have released the code and demo program required for LLM with self-verification ☆49 Updated last year
- AbstainQA, ACL 2024 ☆19 Updated last month
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆84 Updated 5 months ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆65 Updated 2 years ago
- Personality Alignment of Language Models ☆18 Updated 2 months ago
- [EMNLP 2024 Findings] To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models ☆19 Updated last week
- Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models" ☆46 Updated last year