🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access
☆58 · Jun 2, 2025 · Updated last year
Alternatives and similar repositories for BAIT
Users that are interested in BAIT are comparing it to the libraries listed below.
- [IEEE S&P'24] ODSCAN: Backdoor Scanning for Object Detection Models ☆22 · Oct 5, 2025 · Updated 8 months ago
- ☆16 · Sep 4, 2024 · Updated last year
- ☆16 · Dec 29, 2023 · Updated 2 years ago
- Siren: Byzantine-robust Federated Learning via Proactive Alarming (SoCC '21) ☆11 · Mar 28, 2024 · Updated 2 years ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur… ☆11 · Aug 24, 2022 · Updated 3 years ago
- ☆18 · Aug 15, 2022 · Updated 3 years ago
- ☆20 · Feb 11, 2024 · Updated 2 years ago
- [NDSS'23] BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense ☆17 · May 7, 2024 · Updated 2 years ago
- Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens ☆29 · Feb 17, 2026 · Updated 4 months ago
- Distribution Preserving Backdoor Attack in Self-supervised Learning ☆20 · Jan 27, 2024 · Updated 2 years ago
- ☆15 · Feb 26, 2025 · Updated last year
- [NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models ☆312 · Mar 13, 2026 · Updated 3 months ago
- [Oakland 2024] Exploring the Orthogonality and Linearity of Backdoor Attacks ☆29 · Apr 15, 2025 · Updated last year
- Implementation of Implicit Knowledge Extraction Attack. ☆23 · Apr 17, 2026 · Updated 2 months ago
- Backdooring Neural Code Search ☆14 · Sep 8, 2023 · Updated 2 years ago
- Composite Backdoor Attacks Against Large Language Models ☆25 · Apr 12, 2024 · Updated 2 years ago
- [AAAI'21] Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification ☆30 · Dec 31, 2024 · Updated last year
- [NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models" ☆26 · Aug 20, 2025 · Updated 10 months ago
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks ☆22 · Sep 18, 2025 · Updated 9 months ago
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107 ☆21 · Aug 10, 2024 · Updated last year
- Official repository for CVPR'23 paper: Detecting Backdoors in Pre-trained Encoders ☆38 · Sep 25, 2023 · Updated 2 years ago
- ☆18 · Jun 15, 2021 · Updated 5 years ago
- [ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models ☆56 · Jul 24, 2024 · Updated last year
- ☆27 · Aug 28, 2024 · Updated last year
- Implementation of "Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches" ☆25 · Aug 31, 2022 · Updated 3 years ago
- Example TrojAI Submission ☆27 · Dec 6, 2024 · Updated last year
- Nyx: Detecting Exploitable Front-Running Vulnerabilities in Smart Contracts ☆23 · May 11, 2024 · Updated 2 years ago
- Code for AAAI 2021 "Towards Feature Space Adversarial Attack". ☆30 · Aug 24, 2021 · Updated 4 years ago
- ☆13 · May 1, 2024 · Updated 2 years ago
- ☆19 · Mar 9, 2024 · Updated 2 years ago
- Code for NDSS 2022 paper "MIRROR: Model Inversion for Deep Learning Network with High Fidelity" ☆27 · May 9, 2023 · Updated 3 years ago
- [ICLR 2023, Best Paper Award at ECCV’22 AROW Workshop] FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning ☆59 · Dec 11, 2024 · Updated last year
- Code for paper "The Philosopher’s Stone: Trojaning Plugins of Large Language Models" ☆33 · Sep 11, 2024 · Updated last year
- ☆38 · Oct 17, 2024 · Updated last year
- ☆19 · Feb 25, 2024 · Updated 2 years ago
- ☆16 · May 23, 2024 · Updated 2 years ago
- [NDSS 2025] CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling ☆18 · Jan 18, 2025 · Updated last year
- ☆12 · May 27, 2022 · Updated 4 years ago
- This is the implementation for CVPR 2022 Oral paper "Better Trigger Inversion Optimization in Backdoor Scanning." ☆24 · Apr 5, 2022 · Updated 4 years ago