Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
☆18Aug 22, 2024Updated last year
Alternatives and similar repositories for ferret
Users that are interested in ferret are comparing it to the libraries listed below
- Our EMNLP 2022 paper on VIP-Based Prompting for Parameter-Efficient Learning☆10Oct 22, 2022Updated 3 years ago
- Test LLMs against jailbreaks and unprecedented harms☆40Oct 19, 2024Updated last year
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref…☆72Mar 3, 2025Updated last year
- ☆21Jul 26, 2025Updated 7 months ago
- ☆29May 22, 2025Updated 9 months ago
- Our EMNLP 2022 paper on MCQA☆23Jan 15, 2023Updated 3 years ago
- ☆22Mar 16, 2023Updated 2 years ago
- This repository contains the dataset and the pytorch implementations of the models from the paper CIDER: Commonsense Inference for Dialog…☆27Oct 30, 2022Updated 3 years ago
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud…☆31Mar 20, 2025Updated 11 months ago
- ☆28Oct 14, 2021Updated 4 years ago
- Restore safety in fine-tuned language models through task arithmetic☆32Mar 28, 2024Updated last year
- Study and research with your docs, media, and AI in one place☆33Updated this week
- Plan✕ is a platform for creating and publishing digital planning services☆17Updated this week
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- A re-implementation of the "Red Teaming Language Models with Language Models" paper by Perez et al., 2022☆35Oct 9, 2023Updated 2 years ago
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state.☆72May 22, 2025Updated 9 months ago
- JudgeLRM: Large Reasoning Models as a Judge☆41Jan 29, 2026Updated last month
- This repository contains the Parasol processor, which enables next-generation privacy preserving applications. Users can run arbitrary co…☆11Updated this week
- ☆13Nov 5, 2024Updated last year
- ☆39Apr 15, 2024Updated last year
- ☆56May 21, 2025Updated 9 months ago
- Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizX…☆88Mar 15, 2024Updated last year
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling☆36Jul 12, 2024Updated last year
- EmotionCircuits-LLM: A complete, reproducible framework for discovering and controlling emotion circuits in large language models.☆25Oct 20, 2025Updated 4 months ago
- Lateral Inhibition-Inspired Convolutional Neural Network for Visual Attention and Saliency Detection☆13Nov 6, 2020Updated 5 years ago
- Identification of the Adversary from a Single Adversarial Example (ICML 2023)☆10Jul 15, 2024Updated last year
- An implementation of MSSRM method☆11Mar 23, 2023Updated 2 years ago
- Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha…☆11Nov 26, 2024Updated last year
- ☆24Feb 18, 2026Updated last week
- ☆16May 13, 2021Updated 4 years ago
- ☆16Jan 16, 2025Updated last year
- Entry for the first Hunan Province Artificial Intelligence Competition (2020)☆11Feb 17, 2022Updated 4 years ago
- Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und…☆42Mar 12, 2023Updated 2 years ago
- On the Robustness of GUI Grounding Models Against Image Attacks☆12Apr 8, 2025Updated 10 months ago
- YOLO object detection algorithm☆15Jul 27, 2025Updated 7 months ago
- ☆14May 1, 2023Updated 2 years ago
- A very simple PE parser that lists all import and export entries☆12Oct 11, 2018Updated 7 years ago
- Python platform for parallel Surrogate-Based Optimization☆12Nov 27, 2024Updated last year
- Machine Learning for Mathematical Formalization☆11Jul 20, 2024Updated last year