hbseong97 / HarmAug
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
☆13 Updated 10 months ago
Alternatives and similar repositories for HarmAug
Users who are interested in HarmAug are comparing it to the libraries listed below.
- Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23) ☆15 Updated 2 years ago
- ☆11 Updated 2 years ago
- ☆24 Updated 2 years ago
- ☆20 Updated 2 years ago
- The git repository of Modular Prompted Chatbot paper ☆35 Updated 2 years ago
- Evaluating Multimodal Generative AI with Korean Educational Standards, NAACL 2025. ☆24 Updated 8 months ago
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024). ☆43 Updated last year
- Official PyTorch implementation of "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data" (NeurIPS'23) ☆15 Updated 2 years ago
- This repository contains the official code for the paper: "Prompt Injection: Parameterization of Fixed Inputs" ☆32 Updated last year
- ☆14 Updated 3 years ago
- CareCall for Seniors: Role Specified Open-Domain Dialogue dataset generated by leveraging LLMs (NAACL 2022). ☆60 Updated 3 years ago
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con… ☆50 Updated 2 years ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆58 Updated last year
- ☆33 Updated 2 months ago
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models ☆15 Updated 2 years ago
- KAIST AI605 Deep Learning for NLP ☆31 Updated 3 years ago
- ☆27 Updated last year
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 Updated 10 months ago
- [ICLR 2022] Towards Continual Knowledge Learning of Language Models ☆92 Updated 3 years ago
- These are papers that I read and reviewed related to NLP, CV, and Deep Learning 😉 You can check paper links and my reviews 😊 ☆13 Updated 2 years ago
- A hackable, simple, and research-friendly GRPO Training Framework with high speed weight synchronization in a multinode environment. ☆36 Updated 5 months ago
- ☆32 Updated 2 years ago
- [ACL 2021] Learning to Perturb Word Embeddings for Out-of-distribution QA ☆16 Updated 3 years ago
- ☆19 Updated last year
- Generalizable Implicit Hate Speech Detection using Contrastive Learning (COLING 2022) ☆14 Updated 3 years ago
- CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean ☆47 Updated last year
- ☆30 Updated 3 years ago
- Code for text augmentation method leveraging large-scale language models ☆61 Updated 4 years ago
- Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-informat… ☆16 Updated 2 years ago
- Model Stock: All we need is just a few fine-tuned models ☆128 Updated 5 months ago