☆23May 20, 2025Updated 10 months ago
Alternatives and similar repositories for libra-eval
Users that are interested in libra-eval are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization☆29Jul 9, 2024Updated last year
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆28Mar 14, 2024Updated 2 years ago
- ☆22Aug 8, 2025Updated 8 months ago
- Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs☆12Nov 7, 2024Updated last year
- ☆22Dec 18, 2024Updated last year
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Papers about red teaming LLMs and Multimodal models.☆160May 28, 2025Updated 10 months ago
- Links to publications that focus on the interpretation and analysis of in-context learning☆15Oct 17, 2024Updated last year
- dotEngine android sdk -- 跨平台音视频实时通信sdk☆14Jun 11, 2019Updated 6 years ago
- ☆12Oct 23, 2022Updated 3 years ago
- [NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning☆11Oct 29, 2024Updated last year
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability.☆18Jan 14, 2025Updated last year
- LLM手撕代码合集☆21Mar 25, 2025Updated last year
- Analyzing LLM Alignment via Token distribution shift☆18Jan 26, 2024Updated 2 years ago
- ☆11Oct 11, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Official implementation of the paper “Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning”☆20Aug 20, 2025Updated 7 months ago
- LLM benchmarks☆13Feb 22, 2024Updated 2 years ago
- Code for NeurIPS 2024 Paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning"☆22May 6, 2025Updated 11 months ago
- ☆14Feb 26, 2025Updated last year
- Complete set of English dialect transformation rules and evaluation code☆16Jun 7, 2024Updated last year
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp…☆21Mar 7, 2024Updated 2 years ago
- ☆11Mar 24, 2023Updated 3 years ago
- Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning☆31Sep 29, 2025Updated 6 months ago
- UCAS大三自然语言处理课程大作业☆12Jun 25, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- This is the official Gtihub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆22Jul 3, 2024Updated last year
- ☆12Apr 18, 2019Updated 6 years ago
- [ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities☆29Apr 2, 2025Updated last year
- ☆15Mar 6, 2026Updated last month
- What I'm learning/practicing☆18Feb 14, 2026Updated 2 months ago
- Data augmentation using OpenCV☆11Jan 12, 2017Updated 9 years ago
- ☆13Jan 14, 2025Updated last year
- 供大学生,竞赛生,高中生查找的math-wiki☆10May 26, 2022Updated 3 years ago
- ☆51Mar 31, 2026Updated 2 weeks ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …☆26Dec 1, 2025Updated 4 months ago
- prototyping stuff☆14Aug 17, 2025Updated 7 months ago
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…☆24Jul 26, 2024Updated last year
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models"☆16Mar 31, 2025Updated last year
- Garnet - a graphical toolkit for Lisp☆20Dec 30, 2021Updated 4 years ago
- ☆129Nov 13, 2023Updated 2 years ago
- Logging in with Scrapy☆14Jan 26, 2018Updated 8 years ago