🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access
☆55 Jun 2, 2025 Updated 10 months ago
Alternatives and similar repositories for BAIT
Users that are interested in BAIT are comparing it to the libraries listed below.
- [IEEE S&P'24] ODSCAN: Backdoor Scanning for Object Detection Models ☆22 Oct 5, 2025 Updated 6 months ago
- ☆16 Sep 4, 2024 Updated last year
- ☆16 Dec 29, 2023 Updated 2 years ago
- Siren: Byzantine-robust Federated Learning via Proactive Alarming (SoCC '21) ☆11 Mar 28, 2024 Updated 2 years ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur… ☆11 Aug 24, 2022 Updated 3 years ago
- ☆18 Aug 15, 2022 Updated 3 years ago
- ☆20 Feb 11, 2024 Updated 2 years ago
- Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens ☆29 Feb 17, 2026 Updated last month
- Distribution Preserving Backdoor Attack in Self-supervised Learning ☆20 Jan 27, 2024 Updated 2 years ago
- ☆14 Feb 26, 2025 Updated last year
- ☆26 Dec 1, 2022 Updated 3 years ago
- [NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models ☆287 Mar 13, 2026 Updated 3 weeks ago
- [Oakland 2024] Exploring the Orthogonality and Linearity of Backdoor Attacks ☆29 Apr 15, 2025 Updated 11 months ago
- Implement of Implicit Knowledge Extraction Attack. ☆21 May 28, 2025 Updated 10 months ago
- Backdooring Neural Code Search ☆14 Sep 8, 2023 Updated 2 years ago
- Composite Backdoor Attacks Against Large Language Models ☆25 Apr 12, 2024 Updated last year
- [AAAI'21] Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification ☆29 Dec 31, 2024 Updated last year
- [NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models" ☆26 Aug 20, 2025 Updated 7 months ago
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks ☆20 Sep 18, 2025 Updated 6 months ago
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107 ☆20 Aug 10, 2024 Updated last year
- Official repository for CVPR'23 paper: Detecting Backdoors in Pre-trained Encoders ☆36 Sep 25, 2023 Updated 2 years ago
- ☆18 Jun 15, 2021 Updated 4 years ago
- [ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models ☆50 Jul 24, 2024 Updated last year
- ☆27 Aug 28, 2024 Updated last year
- Example TrojAI Submission ☆27 Dec 6, 2024 Updated last year
- Official repo for FSE'24 paper "CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking" ☆18 Mar 10, 2025 Updated last year
- Nyx: Detecting Exploitable Front-Running Vulnerabilities in Smart Contracts ☆22 May 11, 2024 Updated last year
- Code for AAAI 2021 "Towards Feature Space Adversarial Attack". ☆30 Aug 24, 2021 Updated 4 years ago
- ☆13 May 1, 2024 Updated last year
- ☆19 Mar 9, 2024 Updated 2 years ago
- Code for NDSS 2022 paper "MIRROR: Model Inversion for Deep Learning Network with High Fidelity" ☆27 May 9, 2023 Updated 2 years ago
- [ICLR 2023, Best Paper Award at ECCV’22 AROW Workshop] FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning ☆60 Dec 11, 2024 Updated last year
- Code for paper "The Philosopher’s Stone: Trojaning Plugins of Large Language Models" ☆29 Sep 11, 2024 Updated last year
- ☆37 Oct 17, 2024 Updated last year
- ☆19 Feb 25, 2024 Updated 2 years ago
- ☆16 May 23, 2024 Updated last year
- [ECCV'24] UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening ☆10 Dec 18, 2025 Updated 3 months ago
- [NDSS 2025] CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling ☆17 Jan 18, 2025 Updated last year
- ☆12 May 27, 2022 Updated 3 years ago