Starter kit and data loading code for the Trojan Detection Challenge NeurIPS 2022 competition
☆32Jul 26, 2023Updated 2 years ago
Alternatives and similar repositories for tdc-starter-kit
Users who are interested in tdc-starter-kit are comparing it to the libraries listed below.
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.☆91May 19, 2024Updated 2 years ago
- A text-based game where language models learn to lie and to detect lies.☆12Oct 4, 2023Updated 2 years ago
- ☆10Oct 31, 2022Updated 3 years ago
- The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on …☆20Apr 27, 2023Updated 3 years ago
- [CVPR 2022] "Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free" by Tianlong Chen*, Zhenyu Zhang*, Yihua Zhang*, Shiyu C…☆27Oct 5, 2022Updated 3 years ago
- Example TrojAI Submission☆27Dec 6, 2024Updated last year
- ☆12May 27, 2022Updated 4 years ago
- Jiminy Cricket Environment (NeurIPS 2021)☆25Feb 12, 2022Updated 4 years ago
- Official repository for CVPR'23 paper: Detecting Backdoors in Pre-trained Encoders☆38Sep 25, 2023Updated 2 years ago
- Redwood Research's transformer interpretability tools☆15Apr 15, 2022Updated 4 years ago
- ☆26Dec 1, 2022Updated 3 years ago
- ☆18Jun 15, 2021Updated 5 years ago
- This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…☆19Jun 7, 2023Updated 3 years ago
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh…☆21Oct 1, 2022Updated 3 years ago
- ☆28Feb 1, 2023Updated 3 years ago
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆22Jul 3, 2024Updated last year
- ☆19Mar 9, 2024Updated 2 years ago
- TrojanZoo provides a universal PyTorch platform to conduct security research (especially backdoor attacks/defenses) of image classifica…☆303Aug 25, 2025Updated 9 months ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur…☆11Aug 24, 2022Updated 3 years ago
- Demo code for the paper: One Thing to Fool them All: Generating Interpretable, Universal, and Physically-Realizable Adversarial Features☆12Nov 30, 2023Updated 2 years ago
- Codes for reproducing the results of the paper "Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness" published at IC…☆27Apr 29, 2020Updated 6 years ago
- ABS: Scanning Neural Networks for Back-doors by Artificial Brain Stimulation☆51Jun 1, 2022Updated 4 years ago
- Implementation of "Piracy Resistant Watermarks for Deep Neural Networks" in TensorFlow.☆12Dec 5, 2020Updated 5 years ago
- [CVPR 2023] The official implementation of our CVPR 2023 paper "Detecting Backdoors During the Inference Stage Based on Corruption Robust…☆25May 25, 2023Updated 3 years ago
- This is the implementation for CVPR 2022 Oral paper "Better Trigger Inversion Optimization in Backdoor Scanning."☆24Apr 5, 2022Updated 4 years ago
- ☆22Jun 18, 2025Updated last year
- Bullseye Polytope Clean-Label Poisoning Attack☆18Nov 5, 2020Updated 5 years ago
- This work corroborates a run-time Trojan detection method exploiting STRong Intentional Perturbation of inputs, is a multi-domain Trojan …☆10Mar 7, 2021Updated 5 years ago
- Codes for the ICLR 2022 paper: Trigger Hunting with a Topological Prior for Trojan Detection☆11Sep 19, 2023Updated 2 years ago
- ☆29Jun 17, 2024Updated 2 years ago
- This is the official implementation of our paper Untargeted Backdoor Attack against Object Detection.☆27Mar 6, 2023Updated 3 years ago
- [AAAI'21] Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification☆30Dec 31, 2024Updated last year
- This repository contains code developed by the SRI team for the IARPA/TrojAI program.☆21Jul 1, 2021Updated 4 years ago
- ☆20Feb 17, 2023Updated 3 years ago
- ☆11Jan 25, 2022Updated 4 years ago
- Benchmark to estimate model sycophancy☆30Nov 30, 2025Updated 6 months ago
- Machine Learning & Security Seminar @Purdue University☆25May 9, 2023Updated 3 years ago
- This is for the ACM MM paper "Backdoor Attack on Crowd Counting"☆17Jul 10, 2022Updated 3 years ago
- The Happy Faces Benchmark☆15Jul 20, 2023Updated 2 years ago