A Domain-Specific Language, Jailbreak Attack Synthesizer and Dynamic LLM Redteaming Toolkit
☆27 · Dec 5, 2024 · Updated last year
Alternatives and similar repositories for h4rm3l
Users that are interested in h4rm3l are comparing it to the libraries listed below.
- ☆10 · Nov 8, 2022 · Updated 3 years ago
- A Dataset and Results for Classifying Emotions Across Languages ☆10 · Jun 20, 2021 · Updated 4 years ago
- [CVPR 2025] Official implementation for JOOD "Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy" ☆22 · Jun 11, 2025 · Updated 10 months ago
- ☆46 · Apr 29, 2025 · Updated last year
- ☆16 · Feb 4, 2025 · Updated last year
- This repository contains the data and code for the paper "SideControl: Controlled Open-domain Dialogue Generation via Additive Side Netwo… ☆12 · Dec 1, 2021 · Updated 4 years ago
- CRFs based Chinese word segmentor ☆21 · Oct 8, 2014 · Updated 11 years ago
- AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies ☆30 · Aug 14, 2024 · Updated last year
- ☆10 · Jun 8, 2024 · Updated last year
- Context-based Dialogue Act Recognition using Recurrent Neural Networks ☆13 · Nov 13, 2021 · Updated 4 years ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆133 · Feb 24, 2025 · Updated last year
- ☆23 · Apr 5, 2023 · Updated 3 years ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 · Apr 8, 2025 · Updated last year
- ☆16 · Sep 12, 2024 · Updated last year
- Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI. ☆57 · Nov 13, 2023 · Updated 2 years ago
- Code for "End-to-End Learning of Flowchart Grounded Task-Oriented Dialogs" ☆14 · Oct 10, 2022 · Updated 3 years ago
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion ☆59 · Oct 1, 2025 · Updated 7 months ago
- Chat with any codebase with MCP servers in a single command ☆13 · May 28, 2025 · Updated 11 months ago
- Official Repository for EvalRS @ KDD 2023: a Rounded Evaluation of Recommender Systems ☆30 · Feb 16, 2024 · Updated 2 years ago
- Refactored libsvm for easily plugging custom kernels (e.g., Tree Kernel) written in Java. ☆23 · Oct 18, 2013 · Updated 12 years ago
- A simple Streamlit application to visualize document chunks and queries in embedding space 🗺️🔍 ☆13 · Apr 15, 2025 · Updated last year
- This repository forked from parlAI. Korean Wizard of Wikipedia task was added to this repo. This repository is going to be moved after EM… ☆16 · Dec 9, 2022 · Updated 3 years ago
- EMNLP 2020: Personalized Dialog Generation with Commonsense ☆18 · Oct 12, 2022 · Updated 3 years ago
- Docker + CVE-2015-2925 = escaping from --volume ☆11 · Jun 30, 2015 · Updated 10 years ago
- ICLR 2022 ☆18 · Apr 15, 2022 · Updated 4 years ago
- NeurIPS'24 - LLM Safety Landscape ☆40 · Oct 21, 2025 · Updated 6 months ago
- Dataset proposed by ''How to Write Summaries with Patterns? Learning towards Abstractive Summarization through Prototype Editing'' ☆18 · May 4, 2021 · Updated 5 years ago
- Evaluating Multimodal Generative AI with Korean Educational Standards, NAACL 2025. ☆26 · May 15, 2025 · Updated 11 months ago
- Prof. Chen's teaching material. ☆10 · Feb 3, 2023 · Updated 3 years ago
- Project page for "Neural Argument Generation Augmented with Externally Retrieved Evidence" ☆21 · Apr 24, 2022 · Updated 4 years ago
- Code for the paper "HALoGEN: Fantastic LLM Hallucinations and Where To Find Them" ☆25 · May 18, 2025 · Updated 11 months ago
- ☆22 · Dec 8, 2022 · Updated 3 years ago
- Restore safety in fine-tuned language models through task arithmetic ☆32 · Mar 28, 2024 · Updated 2 years ago
- Topic Detection and Tracking ☆19 · Apr 21, 2015 · Updated 11 years ago
- ☆18 · Mar 2, 2026 · Updated 2 months ago
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling" ☆25 · Mar 28, 2024 · Updated 2 years ago
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆19 · Aug 22, 2024 · Updated last year
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- ☆19 · Jun 11, 2018 · Updated 7 years ago