🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access
☆56 · Jun 2, 2025 · Updated 11 months ago
Alternatives and similar repositories for BAIT
Users that are interested in BAIT are comparing it to the libraries listed below.
- ☆16 · Sep 4, 2024 · Updated last year
- Siren: Byzantine-robust Federated Learning via Proactive Alarming (SoCC '21) ☆11 · Mar 28, 2024 · Updated 2 years ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur… ☆11 · Aug 24, 2022 · Updated 3 years ago
- ☆18 · Aug 15, 2022 · Updated 3 years ago
- ☆20 · Feb 11, 2024 · Updated 2 years ago
- [NDSS'23] BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense ☆17 · May 7, 2024 · Updated 2 years ago
- Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens ☆29 · Feb 17, 2026 · Updated 3 months ago
- ☆14 · Feb 26, 2025 · Updated last year
- ☆26 · Dec 1, 2022 · Updated 3 years ago
- [NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models ☆299 · Mar 13, 2026 · Updated 2 months ago
- [Oakland 2024] Exploring the Orthogonality and Linearity of Backdoor Attacks ☆29 · Apr 15, 2025 · Updated last year
- Implement of Implicit Knowledge Extraction Attack. ☆23 · Apr 17, 2026 · Updated last month
- Backdooring Neural Code Search ☆14 · Sep 8, 2023 · Updated 2 years ago
- Composite Backdoor Attacks Against Large Language Models ☆25 · Apr 12, 2024 · Updated 2 years ago
- [AAAI'21] Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification ☆30 · Dec 31, 2024 · Updated last year
- [NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models" ☆26 · Aug 20, 2025 · Updated 9 months ago
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks ☆21 · Sep 18, 2025 · Updated 8 months ago
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107 ☆21 · Aug 10, 2024 · Updated last year
- Official repository for CVPR'23 paper: Detecting Backdoors in Pre-trained Encoders ☆38 · Sep 25, 2023 · Updated 2 years ago
- ☆18 · Jun 15, 2021 · Updated 4 years ago
- [ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models ☆52 · Jul 24, 2024 · Updated last year
- ☆27 · Aug 28, 2024 · Updated last year
- Implementation of "Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches" ☆25 · Aug 31, 2022 · Updated 3 years ago
- Example TrojAI Submission ☆27 · Dec 6, 2024 · Updated last year
- Official repo for FSE'24 paper "CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking" ☆19 · Mar 10, 2025 · Updated last year
- Code for AAAI 2021 "Towards Feature Space Adversarial Attack". ☆30 · Aug 24, 2021 · Updated 4 years ago
- ☆13 · May 1, 2024 · Updated 2 years ago
- ☆19 · Mar 9, 2024 · Updated 2 years ago
- Code for NDSS 2022 paper "MIRROR: Model Inversion for Deep Learning Network with High Fidelity" ☆27 · May 9, 2023 · Updated 3 years ago
- [ICLR 2023, Best Paper Award at ECCV’22 AROW Workshop] FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning ☆59 · Dec 11, 2024 · Updated last year
- Code for paper "The Philosopher’s Stone: Trojaning Plugins of Large Language Models" ☆30 · Sep 11, 2024 · Updated last year
- ☆38 · Oct 17, 2024 · Updated last year
- ☆16 · May 23, 2024 · Updated last year
- [ECCV'24] UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening ☆10 · Dec 18, 2025 · Updated 5 months ago
- [NDSS 2025] CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling ☆17 · Jan 18, 2025 · Updated last year
- ☆12 · May 27, 2022 · Updated 3 years ago
- ☆26 · Jan 5, 2026 · Updated 4 months ago
- This is the implementation for CVPR 2022 Oral paper "Better Trigger Inversion Optimization in Backdoor Scanning." ☆24 · Apr 5, 2022 · Updated 4 years ago
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang… ☆22 · Jul 3, 2024 · Updated last year