Code for the website www.jailbreakchat.com
☆119 · Aug 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for jailbreakchat
Users that are interested in jailbreakchat are comparing it to the libraries listed below.
- Experimental toolbox for quantum Shapley values. ☆10 · Jan 2, 2024 · Updated 2 years ago
- The official implementation of the paper "Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models" (ICML 2023 Wor… ☆22 · Mar 19, 2024 · Updated 2 years ago
- [NeurIPS 23] Characterizing OOD Error via Optimal Transport ☆13 · Nov 19, 2023 · Updated 2 years ago
- [ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs via Flipping". ☆170 · May 2, 2025 · Updated 10 months ago
- ☆34 · Nov 26, 2024 · Updated last year
- [EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156 ☆49 · Dec 14, 2023 · Updated 2 years ago
- Code and data for "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image" ☆24 · Mar 26, 2025 · Updated last year
- The official implementation of Self-aware Object Detection [CVPR 2023] ☆13 · Jun 30, 2023 · Updated 2 years ago
- A curated list of explainability-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to… ☆53 · Jun 25, 2025 · Updated 9 months ago
- Code for NeurIPS 2024 Paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning" ☆22 · May 6, 2025 · Updated 10 months ago
- ☆33 · Jun 24, 2024 · Updated last year
- This is the available code for the paper `evidential fully convolutional network for semantic segmentation (arXiv preprint arXiv:2103.135… ☆14 · Jun 1, 2022 · Updated 3 years ago
- [ICCV 2023] HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness ☆17 · Sep 28, 2023 · Updated 2 years ago
- ☆711 · Jul 2, 2025 · Updated 8 months ago
- A system that analyses patient symptoms and provides preliminary diagnoses along with recommended treatments ⚕️ ☆14 · Jul 28, 2023 · Updated 2 years ago
- Deep Learning & Information Bottleneck ☆64 · Jun 30, 2023 · Updated 2 years ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆380 · Jan 23, 2025 · Updated last year
- Official Implementation of the paper: "A Rate-Distortion View of Uncertainty Quantification", ICML 2024 ☆29 · Sep 3, 2024 · Updated last year
- Code for ICLR 2025 Failures to Find Transferable Image Jailbreaks Between Vision-Language Models ☆36 · Jun 1, 2025 · Updated 9 months ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models. ☆14 · Dec 16, 2024 · Updated last year
- Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025] ☆79 · Jan 23, 2025 · Updated last year
- Debiasing Through Data Attribution ☆13 · May 23, 2024 · Updated last year
- [NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality ☆20 · Oct 22, 2025 · Updated 5 months ago
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion ☆59 · Oct 1, 2025 · Updated 5 months ago
- A summary of adversarial attacks against large language models ☆39 · Dec 22, 2023 · Updated 2 years ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆113 · Sep 28, 2024 · Updated last year
- ⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs ☆467 · Jan 31, 2024 · Updated 2 years ago
- Implementation of BEAST adversarial attack for language models (ICML 2024) ☆89 · May 14, 2024 · Updated last year
- A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack–Defense Evaluation ☆63 · Mar 2, 2026 · Updated 3 weeks ago
- This repository provides a benchmark for prompt injection attacks and defenses in LLMs ☆413 · Oct 29, 2025 · Updated 5 months ago
- ☆14 · Mar 23, 2021 · Updated 5 years ago
- The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving. ☆40 · Mar 8, 2026 · Updated 3 weeks ago
- [ECCV 2020] Official code for "Comprehensive Image Captioning via Scene Graph Decomposition" ☆99 · Aug 20, 2024 · Updated last year
- [🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound … ☆28 · Nov 1, 2025 · Updated 4 months ago
- Classification of animal sounds in a hyperdiverse rainforest using Convolutional Neural Networks (Sun et al, 2021) ☆13 · Oct 16, 2023 · Updated 2 years ago
- Implementation of stop sequencer for Huggingface Transformers ☆16 · Jun 6, 2023 · Updated 2 years ago
- ☆12 · Jul 14, 2025 · Updated 8 months ago
- [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability ☆176 · Dec 18, 2024 · Updated last year
- Winning Hackathon entry for Streamlit LLM Hackathon October 2023 ☆16 · Oct 19, 2023 · Updated 2 years ago