Social-AI-Studio / ToxiCloakCN
Official repository for EMNLP'24 paper "ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations"
☆34 Updated 4 months ago
Alternatives and similar repositories for ToxiCloakCN:
Users who are interested in ToxiCloakCN are comparing it to the repositories listed below:
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆103 Updated 5 months ago
- Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models ☆26 Updated last year
- The code implementation of the paper CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Low Resource With Contrastive Learni… ☆14 Updated 10 months ago
- Code for paper 'Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning' ☆16 Updated 10 months ago
- Implementation of "ACL'24: When Do LLMs Need Retrieval Augmentation? Mitigating LLMs’ Overconfidence Helps Retrieval Augmentation" ☆21 Updated 7 months ago
- A collection of survey papers and resources related to Large Language Models (LLMs). ☆40 Updated last year
- Yelp Simulator for WWW'25 AgentSociety Challenge ☆65 Updated this week
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆75 Updated last year
- [ICLR'24 Spotlight] The official codes of our work on AIGC detection: "Multiscale Positive-Unlabeled Detection of AI-Generated Texts" ☆116 Updated last year
- The reinforcement learning codes for dataset SPA-VL ☆28 Updated 7 months ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆124 Updated 2 months ago
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models ☆28 Updated 3 months ago
- Long Form NLG Generation Based on Large Language Models ☆14 Updated last year
- The code and resource of "Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmark"… ☆61 Updated 2 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆110 Updated 3 months ago
- SEA is an automated paper review framework capable of generating comprehensive and high-quality review feedback with high consistency for… ☆58 Updated 2 months ago
- Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024. ☆27 Updated 3 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆117 Updated 8 months ago