A Domain-Specific Language, Jailbreak Attack Synthesizer and Dynamic LLM Redteaming Toolkit
☆27 · Dec 5, 2024 · Updated last year
Alternatives and similar repositories for h4rm3l
Users that are interested in h4rm3l are comparing it to the libraries listed below.
- ☆10 · Nov 8, 2022 · Updated 3 years ago
- A Dataset and Results for Classifying Emotions Across Languages ☆10 · Jun 20, 2021 · Updated 4 years ago
- [CVPR 2025] Official implementation for JOOD "Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy" ☆21 · Jun 11, 2025 · Updated last year
- Belief in the Machine: Investigating Epistemological Blind Spots of Language Models ☆34 · Apr 19, 2025 · Updated last year
- ☆17 · Feb 4, 2025 · Updated last year
- ☆14 · Jun 8, 2018 · Updated 8 years ago
- [NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling ☆14 · Sep 27, 2025 · Updated 8 months ago
- Automatically modelling and distilling knowledge within AI. In other words, summarising the AI research firehose. ☆25 · Mar 15, 2019 · Updated 7 years ago
- This repository contains the data and code for the paper "SideControl: Controlled Open-domain Dialogue Generation via Additive Side Netwo… ☆12 · Dec 1, 2021 · Updated 4 years ago
- AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies ☆30 · Aug 14, 2024 · Updated last year
- Context-based Dialogue Act Recognition using Recurrent Neural Networks ☆13 · Nov 13, 2021 · Updated 4 years ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆136 · Feb 24, 2025 · Updated last year
- ☆23 · Apr 5, 2023 · Updated 3 years ago
- Code for our ACL19 paper on argument generation ☆14 · Nov 9, 2020 · Updated 5 years ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 · Apr 8, 2025 · Updated last year
- ☆25 · Jun 2, 2026 · Updated 2 weeks ago
- Code for the ACL 2024 paper "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning" ☆14 · Aug 13, 2025 · Updated 10 months ago
- Shared code for training sentence embeddings with Flax / JAX ☆28 · Jul 15, 2021 · Updated 4 years ago
- quica is a tool to run inter coder agreement pipelines in an easy and effective way. Multiple measures are run and results are collected… ☆23 · Nov 9, 2020 · Updated 5 years ago
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion ☆61 · Oct 1, 2025 · Updated 8 months ago
- Chat with any codebase with MCP servers in a single command ☆13 · May 28, 2025 · Updated last year
- Implementation of "Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions" ☆24 · Aug 27, 2024 · Updated last year
- Influence Estimation for Gradient-Boosted Decision Trees ☆29 · May 27, 2024 · Updated 2 years ago
- This repository forked from parlAI. Korean Wizard of Wikipedia task was added to this repo. This repository is going to be moved after EM… ☆16 · Dec 9, 2022 · Updated 3 years ago
- Docker + CVE-2015-2925 = escaping from --volume ☆11 · Jun 30, 2015 · Updated 10 years ago
- [NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey ☆111 · Aug 7, 2024 · Updated last year
- NeurIPS'24 - LLM Safety Landscape ☆40 · Oct 21, 2025 · Updated 7 months ago
- Evaluating Multimodal Generative AI with Korean Educational Standards, NAACL 2025. ☆26 · May 15, 2025 · Updated last year
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated 2 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆38 · Apr 1, 2025 · Updated last year
- Project page for "Neural Argument Generation Augmented with Externally Retrieved Evidence" ☆21 · Apr 24, 2022 · Updated 4 years ago
- Code for the paper "HALoGEN: Fantastic LLM Hallucinations and Where To Find Them" ☆25 · May 18, 2025 · Updated last year
- ☆22 · Dec 8, 2022 · Updated 3 years ago
- Restore safety in fine-tuned language models through task arithmetic ☆33 · Mar 28, 2024 · Updated 2 years ago
- ☆18 · Mar 2, 2026 · Updated 3 months ago
- Reference implementation of algorithms for reinforcement learning and Markov decision processes. ☆12 · Jan 28, 2021 · Updated 5 years ago
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling" ☆25 · Mar 28, 2024 · Updated 2 years ago
- ☆10 · May 21, 2026 · Updated 3 weeks ago
- FastAPI app that uses OpenAI APIs to stream responses ☆19 · Jun 27, 2024 · Updated last year