usail-hkust / Jailjudge
JAILJUDGE: A comprehensive evaluation benchmark covering a wide range of risk scenarios with complex malicious prompts (e.g., synthetic, adversarial, in-the-wild, and multi-language scenarios), along with high-quality human-annotated test datasets.
☆22Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for Jailjudge
- Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)☆81Updated 2 weeks ago
- SEA is an automated paper review framework capable of generating comprehensive and high-quality review feedback with high consistency for…☆47Updated last month
- ☆33Updated 2 weeks ago
- UniGen: A Unified Framework for Dataset Generation via Large Language Model☆28Updated last month
- [ICML2024] "LLaGA: Large Language and Graph Assistant", Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, Zhangyang Wang☆79Updated 2 months ago
- This is the repo for the survey of Bias and Fairness in IR with LLMs.☆41Updated 2 weeks ago
- Official repository of "Can Language Models Solve Graph Problems in Natural Language?". NeurIPS 2023 (Spotlight)☆114Updated 2 months ago
- [NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?"☆80Updated 2 weeks ago
- Code for `Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum`☆17Updated 7 months ago
- ☆16Updated 3 weeks ago
- A list of awesome papers on LLM tool learning.☆18Updated 3 months ago
- Repo of "Large Language Model-based Human-Agent Collaboration for Complex Task Solving" (EMNLP 2024 Findings)☆22Updated last month
- Data and code for "Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks" (KDD 2024)☆20Updated 2 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$☆29Updated 2 weeks ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation"☆10Updated last month
- ☆33Updated 3 weeks ago
- Implementation of the MATRIX framework (ICML 2024)☆39Updated 6 months ago
- UrbanKGent is an urban knowledge graph construction agent.☆26Updated 3 weeks ago
- ☆12Updated 8 months ago
- A framework to empower LLMs on graph reasoning and generation. Refer to our paper: https://arxiv.org/pdf/2402.08785.pdf☆70Updated 3 months ago
- Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications☆58Updated last month
- Code for "Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models", ICLR 2024 Oral.☆20Updated 6 months ago
- Official implementation of paper "Efficient Tuning and Inference for Large Language Models on Textual Graphs"☆23Updated 4 months ago
- [NeurIPS 2024] The implementation of paper "On Softmax Direct Preference Optimization for Recommendation"☆33Updated this week
- ☆61Updated 3 months ago
- Official Implementation of ICLR 2024 paper "Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representat…☆180Updated 5 months ago
- AdaICL: Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection☆16Updated last year
- ☆25Updated last year
- ☆12Updated 8 months ago
- Implementation of GraphPrompter (The Web Conference 2024 Short Paper)☆19Updated 7 months ago