AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies
☆29Aug 14, 2024Updated last year
Alternatives and similar repositories for air-bench-2024
Users that are interested in air-bench-2024 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- SEAT☆21Oct 10, 2023Updated 2 years ago
- ☆24Mar 1, 2025Updated last year
- Official implementation of Tabular Transfer Learning via Prompting LLMs (COLM 2024).☆13Aug 6, 2024Updated last year
- Aioli: A unified optimization framework for language model data mixing☆32Jan 17, 2025Updated last year
- ICCV 2023 - AdaptGuard: Defending Against Universal Attacks for Model Adaptation☆11Dec 23, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆10Jul 3, 2024Updated last year
- Generate custom text files for dataloader within UDA methods☆14May 24, 2023Updated 2 years ago
- ☆14Mar 4, 2024Updated 2 years ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/☆26Mar 10, 2025Updated last year
- ☆13May 10, 2025Updated 10 months ago
- ☆79Nov 17, 2025Updated 4 months ago
- ☆32Jul 8, 2024Updated last year
- Ranking-Consistent Language-Image Pretraining☆12Oct 24, 2025Updated 5 months ago
- Deep Learning - Visual Representation Learning by solving Jigsaw puzzles using Deep Reinforcement Learning☆10Dec 8, 2016Updated 9 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- [ICLR 2024] Towards Elminating Hard Label Constraints in Gradient Inverision Attacks☆14Feb 6, 2024Updated 2 years ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆53Jul 15, 2025Updated 8 months ago
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh…☆11Nov 22, 2022Updated 3 years ago
- A Domain-Specific Language, Jailbreak Attack Synthesizer and Dynamic LLM Redteaming Toolkit☆27Dec 5, 2024Updated last year
- ☆33Feb 17, 2026Updated last month
- ☆37Oct 2, 2024Updated last year
- ☆128Nov 13, 2023Updated 2 years ago
- Code and data repository for "The Mirage of Model Editing: Revisiting Evaluation in the Wild"☆16Aug 27, 2025Updated 7 months ago
- A Benchmark for Evaluating Safety and Trustworthiness in Web Agents for Enterprise Scenarios☆21Mar 12, 2026Updated 2 weeks ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Code for CVPR2018 "Iterative Learning with Open-set Noisy Labels"☆12Mar 12, 2021Updated 5 years ago
- ☆18Oct 7, 2022Updated 3 years ago
- [ICML 2022 Spotlight] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks☆11May 21, 2023Updated 2 years ago
- Let there be clock in the beach - WACV 2022☆15Nov 15, 2021Updated 4 years ago
- CVPR 2025 - R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning☆22Aug 28, 2025Updated 7 months ago
- OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift. ICML 2024 and ICLRW-DMLR 2024☆23Jul 25, 2024Updated last year
- ☆10Sep 29, 2023Updated 2 years ago
- [ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations☆15Oct 28, 2023Updated 2 years ago
- ⚖️ Code for the paper "Ethical Adversaries: Towards Mitigating Unfairness with Adversarial Machine Learning".☆11Dec 8, 2022Updated 3 years ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- MIRROR of https://codeberg.org/catseye/Samovar : Model worlds using propositions and run simulations in them☆18Dec 8, 2023Updated 2 years ago
- Text generation using language models with multiple exit heads☆16Sep 18, 2025Updated 6 months ago
- NN 2023☆23Nov 9, 2022Updated 3 years ago
- ☆18Oct 12, 2022Updated 3 years ago
- MV-RAG combines retrieval with multi-view generation to create accurate 3D-consistent visuals. By retrieving reference images and text, i…☆24Nov 29, 2025Updated 4 months ago
- Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025)☆77Mar 1, 2025Updated last year
- Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning☆22Jul 8, 2024Updated last year