xinleihe / toxic-prompt
☆27 Nov 20, 2023 Updated 2 years ago
Alternatives and similar repositories for toxic-prompt
Users who are interested in toxic-prompt are comparing it to the repositories listed below
- [S&P'24] Test-Time Poisoning Attacks Against Test-Time Adaptation Models ☆19 Feb 18, 2025 Updated 11 months ago
- RAG-based chatbot for retail e-commerce. ☆30 Dec 1, 2024 Updated last year
- https://icml.cc/virtual/2023/poster/24354 ☆10 Aug 15, 2023 Updated 2 years ago
- ☆10 Dec 30, 2021 Updated 4 years ago
- ☆12 Dec 9, 2020 Updated 5 years ago
- ☆11 Jan 2, 2020 Updated 6 years ago
- [Preprint] On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping ☆10 Feb 27, 2020 Updated 5 years ago
- ☆13 Oct 20, 2022 Updated 3 years ago
- Code for the paper "Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification" (EMNLP 2019) ☆15 Feb 25, 2020 Updated 5 years ago
- ☆15 Updated this week
- Caffe code for the paper "Adversarial Manipulation of Deep Representations" ☆17 Nov 6, 2017 Updated 8 years ago
- Deep Learning (a.k.a. Recent Trends in Machine Learning) course at dsai.asia ☆18 Apr 21, 2023 Updated 2 years ago
- Recommend products or brands to users based on browsing history data ☆13 Dec 18, 2020 Updated 5 years ago
- ☆15 Feb 21, 2024 Updated last year
- Github implementation of https://reports.chatclimate.ai/ ☆23 Jun 16, 2025 Updated 8 months ago
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con… ☆50 Dec 20, 2023 Updated 2 years ago
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid… ☆23 May 8, 2023 Updated 2 years ago
- ☆18 Jul 1, 2021 Updated 4 years ago
- Code for "CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples" (NDSS 2020) ☆22 Nov 14, 2020 Updated 5 years ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆57 Aug 17, 2024 Updated last year
- TFLlib-Trustworthy Federated Learning Library and Benchmark ☆62 Nov 15, 2025 Updated 3 months ago
- ☆25 Aug 18, 2023 Updated 2 years ago
- Synthetic data generation for TODs ☆23 Jul 17, 2024 Updated last year
- ☆70 Feb 4, 2024 Updated 2 years ago
- Official repository for "Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory" accepted at EMNLP Find… ☆32 Oct 1, 2024 Updated last year
- ☆26 Dec 1, 2022 Updated 3 years ago
- ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark ☆47 Sep 2, 2025 Updated 5 months ago
- ☆28 Aug 21, 2023 Updated 2 years ago
- The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22) ☆28 Oct 31, 2022 Updated 3 years ago
- Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" on Findings of NAACL 2022 ☆32 Jul 11, 2022 Updated 3 years ago
- ☆25 Nov 14, 2022 Updated 3 years ago
- ☆13 Feb 17, 2025 Updated last year
- Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT ☆35 Oct 15, 2023 Updated 2 years ago
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024) ☆34 Jun 29, 2025 Updated 7 months ago
- Detection of adversarial examples using influence functions and nearest neighbors ☆37 Nov 22, 2022 Updated 3 years ago
- Code for "Zero-Shot Out-of-Distribution Detection with Feature Correlations" ☆13 Jan 19, 2020 Updated 6 years ago
- Tool for testing IPv4 and IPv6 DHCP services ☆13 Mar 27, 2020 Updated 5 years ago
- ☆12 Dec 22, 2025 Updated last month
- Introduction to the Random Forest algorithm for classification problems and how to select important features in your dataset. ☆12 Aug 1, 2020 Updated 5 years ago