☆17 · Jun 18, 2025 · Updated 8 months ago
Alternatives and similar repositories for PolyGuard
Users that are interested in PolyGuard are comparing it to the libraries listed below
- [ICML 2025] UDora: A Unified Red Teaming Framework against LLM Agents ☆32 · Jun 24, 2025 · Updated 8 months ago
- Uncertainty Quantification with Pre-trained Language Models: An Empirical Analysis ☆15 · Oct 11, 2022 · Updated 3 years ago
- ☆22 · Sep 17, 2024 · Updated last year
- This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning… ☆19 · Jun 7, 2023 · Updated 2 years ago
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆66 · Nov 14, 2025 · Updated 3 months ago
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024 ☆27 · Nov 13, 2024 · Updated last year
- The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on … ☆20 · Apr 27, 2023 · Updated 2 years ago
- Official code for the ICCV2023 paper "One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training" ☆20 · Aug 9, 2023 · Updated 2 years ago
- ☆56 · Oct 4, 2024 · Updated last year
- Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models ☆27 · Mar 15, 2025 · Updated 11 months ago
- [CCS 2021] TSS: Transformation-specific smoothing for robustness certification ☆26 · Oct 3, 2023 · Updated 2 years ago
- Machine Learning & Security Seminar @Purdue University ☆25 · May 9, 2023 · Updated 2 years ago
- ☆31 · Apr 8, 2020 · Updated 5 years ago
- On Memorization of Large Language Models in Logical Reasoning ☆74 · Mar 29, 2025 · Updated 11 months ago
- See also APPL: https://github.com/appl-team/appl that improves this project. A Python package for writing Language Models prompts in a ne… ☆37 · Oct 2, 2023 · Updated 2 years ago
- Implementation of the paper "MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation". ☆31 · Dec 12, 2021 · Updated 4 years ago
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024) ☆34 · Jun 29, 2025 · Updated 7 months ago
- Consensus Based Distributed Stochastic Gradient Descent ☆11 · Jun 24, 2018 · Updated 7 years ago
- Starter kit and data loading code for the Trojan Detection Challenge NeurIPS 2022 competition ☆33 · Jul 26, 2023 · Updated 2 years ago
- Towards Memorization-Free Diffusion Models (CVPR2024) Codebase ☆12 · Jun 2, 2024 · Updated last year
- https://icml.cc/virtual/2023/poster/24354 ☆10 · Aug 15, 2023 · Updated 2 years ago
- [ACL 2023] Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generati… ☆10 · Sep 23, 2023 · Updated 2 years ago
- ☆12 · Dec 14, 2022 · Updated 3 years ago
- ☆11 · Jun 18, 2023 · Updated 2 years ago
- Reading comprehension based question-answering model for news articles. ☆11 · Jun 22, 2022 · Updated 3 years ago
- ☆10 · Oct 2, 2024 · Updated last year
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. ☆90 · May 19, 2024 · Updated last year
- RAB: Provable Robustness Against Backdoor Attacks ☆39 · Oct 3, 2023 · Updated 2 years ago
- The Swiss Court Ruling Corpus (SCRC) contains code for extracting information from Swiss court rulings ☆11 · Jan 22, 2025 · Updated last year
- ☆10 · Oct 13, 2022 · Updated 3 years ago
- GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/ ☆13 · Jul 1, 2024 · Updated last year
- Code implementation for paper AbsenceBench: Language Models Can't Tell What's Missing ☆17 · Oct 23, 2025 · Updated 4 months ago
- Implementation of "Piracy Resistant Watermarks for Deep Neural Networks" in TensorFlow. ☆12 · Dec 5, 2020 · Updated 5 years ago
- Implementation of the spotlight: a method for discovering systematic errors in deep learning models ☆11 · Oct 5, 2021 · Updated 4 years ago
- [NeurIPS 2022] Explaining Graph Neural Networks with Structure-Aware Cooperative Games (GStarX) ☆14 · Oct 20, 2022 · Updated 3 years ago
- ☆31 · Sep 19, 2025 · Updated 5 months ago
- TYPO3 Extension ⇢ Integration of sendinblue as finisher of the form extension ☆11 · Jan 23, 2025 · Updated last year
- Official Implementation for "Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approxim… ☆12 · Aug 14, 2024 · Updated last year
- Code of our paper "Method-Level Bug Severity Prediction using Source Code Metrics and LLMs" which is accepted to ISSRE 2023. ☆10 · Nov 12, 2023 · Updated 2 years ago