usail-hkust / Jailjudge
JAILJUDGE: A comprehensive evaluation benchmark covering a wide range of risk scenarios with complex malicious prompts (e.g., synthetic, adversarial, in-the-wild, and multi-language scenarios), along with high-quality human-annotated test datasets.
☆58 · Dec 13, 2024 · Updated last year
Alternatives and similar repositories for Jailjudge
Users who are interested in Jailjudge are comparing it to the libraries listed below.
- The implementation of Meta-Pec ☆12 · Sep 13, 2023 · Updated 2 years ago
- ☆20 · Apr 25, 2023 · Updated 2 years ago
- Official codes of KDD'24 paper "HiFGL: A Hierarchical Framework for Cross-silo Cross-device Federated Graph Learning" ☆10 · Sep 4, 2024 · Updated last year
- Official implementation for ICML24 paper "Irregular Multivariate Time Series Forecasting: A Transformable Patching Graph Neural Networks … ☆125 · Nov 28, 2025 · Updated 2 months ago
- ☆12 · Feb 19, 2024 · Updated last year
- An Awesome Collection of Urban Foundation Models (UFMs). ☆208 · Jan 11, 2026 · Updated last month
- [NeurIPS2025] Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition ☆32 · Oct 21, 2025 · Updated 3 months ago
- UUKG: Unified Urban Knowledge Graph Dataset for Knowledge-Enhanced Urban Spatiotemporal Prediction ☆117 · Apr 29, 2025 · Updated 9 months ago
- Official code for article "LLMLight: Large Language Models as Traffic Signal Control Agents". ☆273 · Aug 12, 2025 · Updated 6 months ago
- ☆14 · Jun 7, 2024 · Updated last year
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization ☆29 · Jul 9, 2024 · Updated last year
- Codes for replicating the dataset described in "A Satellite Imagery Dataset for Long-Term Sustainable Development in United States Cities… ☆12 · Jun 23, 2023 · Updated 2 years ago
- CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control ☆22 · Mar 21, 2025 · Updated 10 months ago
- The official implementation of "Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding" ☆23 · Jun 26, 2025 · Updated 7 months ago
- ☆46 · Oct 22, 2024 · Updated last year
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types ☆24 · Nov 29, 2024 · Updated last year
- 🎮Manipulates mobile phones just like how you would. Official code for "MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficien… ☆27 · Oct 10, 2025 · Updated 4 months ago
- Revolve: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization ☆21 · Dec 13, 2024 · Updated last year
- Reinforcement Learning-based Placement of Charging Stations in Urban Road Networks ☆19 · Feb 13, 2024 · Updated 2 years ago
- ☆25 · Nov 19, 2025 · Updated 2 months ago
- ☆20 · Oct 25, 2022 · Updated 3 years ago
- Imaging Tasks with Event Camera ☆30 · Jan 10, 2025 · Updated last year
- ☆53 · Apr 9, 2025 · Updated 10 months ago
- ☆164 · Sep 2, 2024 · Updated last year
- General research for Dreadnode ☆27 · Jun 17, 2024 · Updated last year
- ☆30 · Oct 18, 2024 · Updated last year
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models ☆33 · May 21, 2025 · Updated 8 months ago
- [TKDE 2024] Official Code of the paper "Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series For… ☆31 · Jul 13, 2024 · Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆13 · Jun 28, 2025 · Updated 7 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Feb 8, 2026 · Updated last week
- quick playground to animate pippin ☆14 · Nov 11, 2024 · Updated last year
- UrbanKGent is an urban knowledge graph construction agent. ☆65 · Oct 14, 2025 · Updated 4 months ago
- The original Shared Recurrent Memory Transformer implementation ☆33 · Jul 11, 2025 · Updated 7 months ago
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 · Apr 7, 2025 · Updated 10 months ago
- [ACL'25 Findings] Official repo for "HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task" ☆37 · Apr 7, 2025 · Updated 10 months ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 · Oct 18, 2024 · Updated last year
- Panda Guard is designed for researching jailbreak attacks, defenses, and evaluation algorithms for large language models (LLMs). ☆61 · Jan 19, 2026 · Updated 3 weeks ago
- A red teaming agent ☆18 · Oct 15, 2025 · Updated 4 months ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago