xinleihe/toxic-prompt

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/xinleihe/toxic-prompt)

xinleihe / toxic-prompt

☆27

Alternatives and similar repositories for toxic-prompt

Users that are interested in toxic-prompt are comparing it to the libraries listed below

Sorting:

unicamp-dl / retailGPT
View on GitHub
RAG-based chatbot for retail e-commerce.
☆31Dec 1, 2024Updated last year
tianshuocong / TePA
View on GitHub
[S&P'24] Test-Time Poisoning Attacks Against Test-Time Adaptation Models
☆19Feb 18, 2025Updated last year
zh1yu4nyu / CodeIPPrompt
View on GitHub
https://icml.cc/virtual/2023/poster/24354
☆10Aug 15, 2023Updated 2 years ago
arpan65 / Insurance-RAG-Chatbot
View on GitHub
Insurance-RAG-Chatbot(IVA): An open-source project featuring a retrieval-augmented chatbot developed using Bedrock, LLM, LangChain, Docke…
☆22May 30, 2024Updated last year
xinleihe / ContrastiveLeaks
View on GitHub
☆10Dec 30, 2021Updated 4 years ago
YitingQu / meme-evolution
View on GitHub
☆14Jul 26, 2024Updated last year
jinyuan-jia / Certify_Topk
View on GitHub
☆11Jan 2, 2020Updated 6 years ago
Sanghyun-Hong / Gradient-Shaping
View on GitHub
[Preprint] On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
☆10Feb 27, 2020Updated 6 years ago
joey1993 / bert-defender
View on GitHub
codes for paper "learning to discriminate perturbations for blocking adversarial attacks in text classification" in EMNLP19
☆15Feb 25, 2020Updated 6 years ago
mireshghallah / ft-memorization
View on GitHub
☆13Oct 20, 2022Updated 3 years ago
dsai-asia / RTML
View on GitHub
Deep Learning (a.k.a. Recent Trends in Machine Learning) course at dsai.asia
☆18Apr 21, 2023Updated 2 years ago
MinnieTan / Recommender-System-Based-on-Purchasing-Behavior-Data
View on GitHub
Recommend products or brands to users based on browsing history data
☆13Dec 18, 2020Updated 5 years ago
tianshuocong / SSLGuard
View on GitHub
[CCS'22] SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders
☆18Jul 12, 2022Updated 3 years ago
EdisonNi-hku / chatreport
View on GitHub
Github implementation of https://reports.chatclimate.ai/
☆23Jun 16, 2025Updated 8 months ago
skywalker023 / confaide
View on GitHub
🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con…
☆50Dec 20, 2023Updated 2 years ago
yunyuntsai / DNN-Model-Stealing
View on GitHub
Code for "CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples" (NDSS 2020)
☆22Nov 14, 2020Updated 5 years ago
thu-coai / Targeted-Data-Extraction
View on GitHub
Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid…
☆23May 8, 2023Updated 2 years ago
ShannonAI / backdoor_nlg
View on GitHub
☆18Jul 1, 2021Updated 4 years ago
chawins / pal
View on GitHub
PAL: Proxy-Guided Black-Box Attack on Large Language Models
☆57Aug 17, 2024Updated last year
abhisheksambyal / Self-supervised-learning-by-context-prediction
View on GitHub
Implementation of "Unsupervised Visual Representation Learning by Context Prediction" by C. Doersh, A. Gupta and A. A. Efros
☆24Nov 18, 2021Updated 4 years ago
mireshghallah / neighborhood-curvature-mia
View on GitHub
☆24Aug 18, 2023Updated 2 years ago
T1aNS1R / Evil-Geniuses
View on GitHub
☆70Feb 4, 2024Updated 2 years ago
PurduePAML / PICCOLO
View on GitHub
☆26Dec 1, 2022Updated 3 years ago
thunlp / NeuBA
View on GitHub
☆25Jun 23, 2021Updated 4 years ago
cchoquette / membership-inference
View on GitHub
Code for the paper: Label-Only Membership Inference Attacks
☆68Sep 11, 2021Updated 4 years ago
JacksonWuxs / PromptRec
View on GitHub
Prompting Small Language Models for Personalized Cold-Start Recommendation
☆31Mar 9, 2024Updated 2 years ago
jeffhj / LM_PersonalInfoLeak
View on GitHub
The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22)
☆28Oct 31, 2022Updated 3 years ago
Hazelsuko07 / TextHide
View on GitHub
TextHide: Tackling Data Privacy in Language Understanding Tasks
☆31Apr 19, 2021Updated 4 years ago
DennisLiu2022 / Membership-Inference-Attacks-by-Exploiting-Loss-Trajectory
View on GitHub
☆25Nov 14, 2022Updated 3 years ago
ml-research / Q16
View on GitHub
☆35May 22, 2024Updated last year
tongzeliang / EvoPrompt
View on GitHub
☆13Feb 17, 2025Updated last year
joeljang / knowledge-unlearning
View on GitHub
[ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models
☆87Sep 12, 2024Updated last year
Rawan19 / Conversational-AI-Chatbot-for-the-Elderly-and-Disabled
View on GitHub
A virtual caregiver system that extracts the expression of mental and physical health states through dialogue-based human-computer intera…
☆14Jan 29, 2023Updated 3 years ago
YouyiSong / Codes-for-Selective-Learning
View on GitHub
Codes for MICCAI 2021 Paper: Selective Learning from External Data for CT Image Segmentation
☆12Oct 10, 2021Updated 4 years ago
Eyr3 / TextCRS
View on GitHub
Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024)
☆34Jun 29, 2025Updated 8 months ago
CFWP / FMradio
View on GitHub
Factor Modeling for radiomics
☆12Aug 29, 2025Updated 6 months ago
Greysahy / ipiguard
View on GitHub
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆16Sep 16, 2025Updated 5 months ago
ArnesSI / dhcpdoctor
View on GitHub
Tool for testing IPv4 and IPv6 DHCP services
☆13Mar 27, 2020Updated 5 years ago
xhan77 / veiled-toxicity-detection
View on GitHub
Fortifying Toxic Speech Detectors Against Veiled Toxicity
☆11Oct 21, 2020Updated 5 years ago