joannahuadu / radiation-error-emulator
Simulator.
☆103 · Updated 2 months ago
Alternatives and similar repositories for radiation-error-emulator
Users interested in radiation-error-emulator are comparing it to the repositories listed below.
- This is the code repository of our submission "Understanding the Dark Side of LLMs' Intrinsic Self-Correction". ☆56 · Updated 6 months ago
- [ACL 2024] The official GitHub repo for the paper "The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Pe…" ☆77 · Updated 11 months ago
- ☆29 · Updated 8 months ago
- ☆56 · Updated last month
- ☆223 · Updated last year
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024). ☆34 · Updated 2 weeks ago
- ☆18 · Updated 9 months ago
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion. ☆47 · Updated 8 months ago
- A curated list of papers & resources on backdoor attacks and defenses in deep learning. ☆213 · Updated last year
- ☆24 · Updated 10 months ago
- ☆82 · Updated last year
- ☆82 · Updated 3 years ago
- A survey on harmful fine-tuning attacks for large language models. ☆192 · Updated last week
- Code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector". ☆40 · Updated 7 months ago
- Official code for the ACL 2024 paper "GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis". ☆57 · Updated 8 months ago
- [NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models". ☆15 · Updated 6 months ago
- Official code for "ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users" (NeurIPS 2024). ☆16 · Updated 8 months ago
- Source code and scripts for the paper "Is Difficulty Calibration All We Need? Towards More Practical Membership Inference Attacks". ☆18 · Updated 7 months ago
- Code for the NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models". ☆58 · Updated 2 years ago
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models. ☆167 · Updated 4 months ago
- A curated list of trustworthy generative AI papers, updated daily. ☆73 · Updated 10 months ago
- SaTML 2023 paper "Backdoor Attacks on Time Series: A Generative Approach" by Yujing Jiang, Xingjun Ma, Sarah Monazam Erfani, and James Bail… ☆18 · Updated 2 years ago
- [EMNLP 2024] Official implementation of "CLEANGEN: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models". ☆15 · Updated 4 months ago
- ☆14 · Updated last year
- Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models. ☆21 · Updated 3 months ago
- Fingerprinting large language models. ☆41 · Updated last year
- Official implementation of [USENIX Security 2025] "StruQ: Defending Against Prompt Injection with Structured Queries". ☆43 · Updated last month
- [ICLR 2024] Official repo of "BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models". ☆37 · Updated 11 months ago
- Repository for "Towards Codable Watermarking for Large Language Models". ☆37 · Updated last year
- Code for the NeurIPS 2024 paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning". ☆14 · Updated 2 months ago