About: Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23)
☆15 · Jul 9, 2023 · Updated 2 years ago
Alternatives and similar repositories for Bayesian-Red-Teaming
Users who are interested in Bayesian-Red-Teaming are comparing it to the repositories listed below.
- ☆15 · Nov 22, 2023 · Updated 2 years ago
- Question-Directed Graph Attention Network for Numerical Reasoning over Text ☆10 · Aug 14, 2020 · Updated 5 years ago
- Official PyTorch implementation of "Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian O… ☆25 · Sep 26, 2023 · Updated 2 years ago
- Effective Unsupervised Domain Adaptation of Neural Rankers by Diversifying Synthetic Query Generation ☆15 · Apr 23, 2025 · Updated 10 months ago
- Official project for the paper "Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers" ☆31 · Jan 13, 2024 · Updated 2 years ago
- Official github repository of CriticControl ☆28 · Aug 6, 2023 · Updated 2 years ago
- All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks ☆18 · Apr 24, 2024 · Updated last year
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ☆19 · Dec 16, 2024 · Updated last year
- Revisiting Character-level Adversarial Attacks for Language Models, ICML 2024 ☆19 · Feb 12, 2025 · Updated last year
- Official code and dataset repository of KoBBQ (TACL 2024) ☆19 · May 13, 2024 · Updated last year
- [NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search" ☆23 · Oct 11, 2024 · Updated last year
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024 ☆27 · Nov 13, 2024 · Updated last year
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid… ☆23 · May 8, 2023 · Updated 2 years ago
- ☆24 · Dec 2, 2023 · Updated 2 years ago
- A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack–Defense Evaluation ☆59 · Updated this week
- Emoji Attack [ICML 2025] ☆41 · Jul 15, 2025 · Updated 7 months ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 · Sep 12, 2024 · Updated last year
- ☆21 · Mar 17, 2025 · Updated 11 months ago
- ☆60 · Mar 9, 2023 · Updated 2 years ago
- This repository contains the official code for the paper: "Prompt Injection: Parameterization of Fixed Inputs" ☆32 · Sep 13, 2024 · Updated last year
- GenRM-CoT: Data release for verification rationales ☆68 · Oct 16, 2024 · Updated last year
- [NeurIPS 2024] Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling ☆34 · Nov 8, 2024 · Updated last year
- Awesome Triton Resources ☆39 · Apr 27, 2025 · Updated 10 months ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆29 · Sep 27, 2024 · Updated last year
- code for COLING paper "A Hybrid Model of Classification and Generation for Spatial Relation Extraction" ☆10 · Oct 20, 2022 · Updated 3 years ago
- Codebase for fine-tuning Llama2 70B to generate math test questions and answers. ☆11 · Aug 30, 2024 · Updated last year
- See also APPL: https://github.com/appl-team/appl that improves this project. A Python package for writing Language Models prompts in a ne… ☆36 · Oct 2, 2023 · Updated 2 years ago
- Test LLMs against jailbreaks and unprecedented harms ☆40 · Oct 19, 2024 · Updated last year
- ☆12 · May 6, 2022 · Updated 3 years ago
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models". ☆12 · Oct 14, 2024 · Updated last year
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 · Oct 2, 2025 · Updated 5 months ago
- ☆11 · Dec 23, 2024 · Updated last year
- Concurrency library ☆17 · Oct 13, 2024 · Updated last year
- Interpretable ML for TabPFN ☆47 · Jul 13, 2025 · Updated 7 months ago
- Implementation of the paper "MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation". ☆31 · Dec 12, 2021 · Updated 4 years ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆32 · Jun 13, 2024 · Updated last year
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023) ☆40 · Aug 28, 2023 · Updated 2 years ago
- [AAAI2024] An official PyTorch implementation of the paper: Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Underst… ☆13 · Dec 8, 2024 · Updated last year
- An active inference model of Lacanian psychoanalysis ☆15 · Jun 7, 2025 · Updated 8 months ago