A curated list of trustworthy Generative AI papers. Daily updating...
☆75Sep 4, 2024Updated last year
Alternatives and similar repositories for Awesome-Foundation-Model-Security
Users that are interested in Awesome-Foundation-Model-Security are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"☆87Nov 28, 2023Updated 2 years ago
- ☆16Jul 25, 2022Updated 3 years ago
- Mostly recording papers about models' trustworthy applications. Intending to include topics like model evaluation & analysis, security, c…☆21May 30, 2023Updated 2 years ago
- A list of papers in NeurIPS 2022 related to adversarial attack and defense / AI security.☆75Dec 5, 2022Updated 3 years ago
- [ICLR2025] Detecting Backdoor Samples in Contrastive Language Image Pretraining☆19Feb 26, 2025Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Anti-Backdoor learning (NeurIPS 2021)☆84Jul 20, 2023Updated 2 years ago
- ☆16Feb 23, 2025Updated last year
- Differential Privacy Guide☆20Jan 9, 2022Updated 4 years ago
- [ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …☆35Oct 23, 2024Updated last year
- ☆37Oct 2, 2024Updated last year
- A curated list of academic events on AI Security & Privacy☆168Aug 22, 2024Updated last year
- Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models☆269May 13, 2024Updated last year
- A curated list of papers & resources linked to data poisoning, backdoor attacks and defenses against them (no longer maintained)☆287Jan 11, 2025Updated last year
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast☆118Mar 26, 2024Updated 2 years ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020☆39Jul 10, 2022Updated 3 years ago
- A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).☆1,911Mar 16, 2026Updated last week
- ☆12Jan 14, 2026Updated 2 months ago
- ☆19Mar 9, 2024Updated 2 years ago
- ☆20Dec 14, 2024Updated last year
- Proof-of-concept implementation for the paper "ThermalScope: A Practical Interrupt Side Channel Attack Based On Thermal Event Interrupts"…☆13Dec 17, 2024Updated last year
- ☆59Mar 9, 2023Updated 3 years ago
- [CVPR 2023] T-SEA: Transfer-based Self-Ensemble Attack on Object Detection☆117Oct 11, 2024Updated last year
- ☆10Oct 31, 2022Updated 3 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- A curated list of Meachine learning Security & Privacy papers published in security top-4 conferences (IEEE S&P, ACM CCS, USENIX Security…☆340Nov 11, 2025Updated 4 months ago
- Pytorch implementation for the pilot study on the robustness of latent diffusion models.☆12Jun 20, 2023Updated 2 years ago
- Code for "Prior Convictions: Black-box Adversarial Attacks with Bandits and Priors"☆13Sep 27, 2018Updated 7 years ago
- Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"☆65Apr 24, 2024Updated last year
- ☆16Feb 23, 2025Updated last year
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM☆39Jan 17, 2025Updated last year
- List of Papers on Attack and Defense (AD) in AI Models☆27Mar 18, 2022Updated 4 years ago
- The code of "NeurJudge: A Circumstance-aware Neural Framework for Legal Judgment Prediction"(SIGIR2021))☆17Jan 3, 2024Updated 2 years ago
- Proof-of-concept implementation for the paper "SegScope: Probing Fine-grained Interrupts via Architectural Footprints" (HPCA'24)☆19Mar 12, 2026Updated 2 weeks ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- A curation of awesome tools, documents and projects about LLM Security.☆1,554Aug 20, 2025Updated 7 months ago
- From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks☆14Feb 23, 2023Updated 3 years ago
- Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime☆53Sep 19, 2023Updated 2 years ago
- ☆371Jan 4, 2026Updated 2 months ago
- Code for paper "Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators"☆16Dec 4, 2024Updated last year
- ☆13Jul 8, 2020Updated 5 years ago
- code of paper "IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Gene…☆35May 23, 2024Updated last year