TrustAIRLab / JailbreakRadar
☆84Updated 6 months ago
Alternatives and similar repositories for JailbreakRadar
Users who are interested in JailbreakRadar compare it to the repositories listed below.
- StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving☆21Updated last year
- [EMNLP 2024 Findings] Official PyTorch Implementation of "Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Ge…☆41Updated 10 months ago
- The official code repo for "Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets" in ICML 2025.☆56Updated 6 months ago
- Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward☆42Updated last month
- [NeurIPS 25 @ ER] Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs☆73Updated last month
- ☆62Updated last year
- ☆75Updated last year
- Code for ACL 2024 long paper: Are AI-Generated Text Detectors Robust to Adversarial Perturbations?☆32Updated last year
- [ACL 2023 findings] Towards Robust Personalized Dialogue Generation via Order-Insensitive Representation Regularization☆17Updated 2 years ago
- An open-source highly heterogeneous entity alignment (HHEA) toolkit.☆32Updated last year
- ☆36Updated last year
- GLT has presented the first attempt to accelerate GNN inference. Though promising, GLT encounters robustness and generalization issues wh…☆28Updated last year
- ☆73Updated last year
- A comprehensive collection of resources focused on addressing and understanding hallucination phenomena in MLLMs.☆35Updated last year
- Concise Evaluation Benchmark for Large Language Models☆25Updated 5 months ago
- Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering☆42Updated last month
- Code of Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Ne…☆28Updated last year
- [ICME 2024] Official Datasets and example of LLM-SAP: Large Language Model Situational Awareness Based Planning☆33Updated 9 months ago
- [ICLR 2025] Official implementation of paper "Improving Data Efficiency via Curating LLM-Driven Rating Systems"☆100Updated 9 months ago
- AutoRLAIF is a cutting-edge framework designed to revolutionize the fine-tuning of large language models through Reinforcement Learning …☆95Updated last year
- ☆49Updated 2 years ago
- alsap_frontend☆63Updated 10 months ago
- Multi-Attentional Deepfake Detection☆22Updated last year
- mobile predict☆25Updated last year
- This script monitors the remaining traffic of VMs on Vultr, DigitalOcean, and Linode. If the remaining traffic is zero, it shuts down the…☆33Updated last year
- Low-code core component: implementation of the data model☆56Updated last year
- NLP self-study repository☆24Updated last year
- Official Code of Logits-Based-Finetuning☆91Updated 6 months ago
- A system demo based on Retrieval-Augmented Generation to answer questions about Buddhism☆84Updated last year
- A Contextual RAG Bot Framework☆82Updated last year