Code for "Goodtriever: Toxicity Mitigation with Retrieval-augmented Language Models"
☆25 · May 30, 2024 · Updated last year
Alternatives and similar repositories for goodtriever
Users that are interested in goodtriever are comparing it to the libraries listed below.
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets ☆10 · Dec 12, 2023 · Updated 2 years ago
- A gentle introduction to Conflict-free replicated data types, including visual demos ☆17 · Jan 14, 2022 · Updated 4 years ago
- Complete set of English dialect transformation rules and evaluation code ☆16 · Jun 7, 2024 · Updated last year
- ☆30 · Aug 9, 2023 · Updated 2 years ago
- Code for paper Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation ☆14 · Jun 10, 2022 · Updated 3 years ago
- [Findings of EMNLP 2022] Code of paper Generative Prompt Tuning for Relation Classification. https://arxiv.org/abs/2210.12435 ☆20 · May 7, 2023 · Updated 2 years ago
- Code and data for paper "Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?". (ACL 2025 Main) ☆22 · Jun 18, 2025 · Updated 9 months ago
- An empathetic counselling chatbot. Retrieval-based, uses finetuned LMs for emotion identification and to boost empathy, novelty and fluen… ☆17 · Jun 8, 2023 · Updated 2 years ago
- code for our EACL 2021 paper: "Challenges in Automated Debiasing for Toxic Language Detection" by Xuhui Zhou, Maarten Sap, Swabha Swayamd… ☆19 · Aug 20, 2021 · Updated 4 years ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆39 · Jan 12, 2024 · Updated 2 years ago
- ☆17 · Aug 2, 2023 · Updated 2 years ago
- Neuron Activation ☆26 · Nov 21, 2024 · Updated last year
- Search engine results page scraper ☆13 · Dec 19, 2018 · Updated 7 years ago
- ☆10 · Sep 17, 2022 · Updated 3 years ago
- The source code of the paper 'Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation' ☆24 · Mar 24, 2023 · Updated 3 years ago
- ☆15 · Mar 20, 2025 · Updated last year
- Fortifying Toxic Speech Detectors Against Veiled Toxicity ☆11 · Oct 21, 2020 · Updated 5 years ago
- Collection of academic and pseudo-academic events, publications and web sites that spam me ☆21 · Mar 16, 2026 · Updated last week
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 · Nov 6, 2023 · Updated 2 years ago
- aigc evals ☆10 · Dec 2, 2023 · Updated 2 years ago
- Investigating and Defending Shortcut Learning in Personalized Diffusion Models ☆13 · Nov 19, 2024 · Updated last year
- Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in … ☆30 · Nov 25, 2021 · Updated 4 years ago
- Global ASP - African Storybook Project for the World ☆18 · Dec 1, 2025 · Updated 3 months ago
- ☆11 · Apr 13, 2023 · Updated 2 years ago
- ☆12 · Oct 20, 2020 · Updated 5 years ago
- [Work in progress] A reading list for machine commonsense reasoning ☆34 · Apr 14, 2020 · Updated 5 years ago
- ☆14 · Jul 23, 2023 · Updated 2 years ago
- an NLP project that aims to identify individuals at risk of suicide. ☆21 · Mar 14, 2020 · Updated 6 years ago
- Replication code for "The Structure of Toxic Conversations on Twitter" (WWW'21) ☆10 · May 25, 2021 · Updated 4 years ago
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"! ☆11 · Oct 16, 2024 · Updated last year
- Scripts to evaluate various bias metrics for different NLG models + decoding algorithms ☆16 · Dec 6, 2023 · Updated 2 years ago
- TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes ☆14 · Jul 1, 2025 · Updated 8 months ago
- Code for the WWW'23 paper "Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy" ☆12 · Feb 20, 2023 · Updated 3 years ago
- ☆16 · Jul 17, 2025 · Updated 8 months ago
- This is the code for neural-Jacana aligner, and the data for MultiMWA dataset. ☆20 · Feb 12, 2023 · Updated 3 years ago
- ☆11 · Oct 16, 2023 · Updated 2 years ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models. ☆14 · Dec 16, 2024 · Updated last year
- Pandora is a Therapeutic AI Assistant designed to act as a therapist in order to provide emotional support to people with anxiety & depre… ☆27 · May 8, 2023 · Updated 2 years ago
- CS194-196 Course Project ☆14 · Feb 20, 2025 · Updated last year