usail-hkust / Jailjudge
JAILJUDGE: A comprehensive evaluation benchmark that covers a wide range of risk scenarios with complex malicious prompts (e.g., synthetic, adversarial, in-the-wild, and multi-language scenarios), along with high-quality human-annotated test datasets.
☆23Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for Jailjudge
- Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)☆84Updated 3 weeks ago
- This is the repo for the survey of Bias and Fairness in IR with LLMs.☆42Updated 3 weeks ago
- SEA is an automated paper review framework capable of generating comprehensive and high-quality review feedback with high consistency for…☆50Updated last month
- ☆39Updated last month
- ☆39Updated 3 weeks ago
- FedJudge: Federated Legal Large Language Model☆31Updated 2 months ago
- [ICML2024] "LLaGA: Large Language and Graph Assistant", Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, Zhangyang Wang☆82Updated 2 months ago
- [NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?"☆84Updated last week
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation"☆11Updated last month
- ☆18Updated last year
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆86Updated 2 months ago
- ☆24Updated 8 months ago
- ☆29Updated last year
- Official implementation of paper "Efficient Tuning and Inference for Large Language Models on Textual Graphs"☆23Updated 4 months ago
- ☆12Updated 9 months ago
- The official implementation of the paper "AgentSquare: Automatic LLM Agent Search in Modular Design Space"☆129Updated this week
- Implementation of the MATRIX framework (ICML 2024)☆39Updated 6 months ago
- ☆31Updated 5 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$☆31Updated 3 weeks ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct☆120Updated last week
- Repo of "Large Language Model-based Human-Agent Collaboration for Complex Task Solving" (EMNLP 2024 Findings)☆22Updated 2 months ago
- Code for "Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models", ICLR 2024 Oral.☆20Updated 7 months ago
- ☆12Updated 8 months ago
- Implementation of our CIKM'2024 paper "Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering"☆20Updated last month
- [NeurIPS'24] The official implementation code of LLM-ESR.☆14Updated 4 months ago
- [KDD'2024] "HiGPT: Heterogeneous Graph Language Models"☆109Updated 5 months ago
- Official repository of "Can Language Models Solve Graph Problems in Natural Language?". NeurIPS 2023 (Spotlight)☆115Updated 3 months ago
- A framework to empower LLMs on graph reasoning and generation. Refer to our paper: https://arxiv.org/pdf/2402.08785.pdf☆72Updated 3 months ago
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)☆48Updated 3 months ago
- Code for "Learning to Edit: Aligning LLMs with Knowledge Editing (ACL 2024)"☆27Updated 3 months ago