swj0419/muse_bench

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/swj0419/muse_bench)

swj0419 / muse_bench

☆34

Alternatives and similar repositories for muse_bench

Users that are interested in muse_bench are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

jaechan-repo / muse_bench
View on GitHub
☆33Aug 9, 2024Updated last year
OPTML-Group / Unlearn-Smooth
View on GitHub
[ICML25] Official repo for "Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond…
☆24Sep 27, 2025Updated 9 months ago
locuslab / open-unlearning
View on GitHub
[NeurIPS D&B '25] The one-stop repository for LLM unlearning
☆571Mar 18, 2026Updated 4 months ago
OPTML-Group / Unlearn-Simple
View on GitHub
[NeurIPS25] Official repo for "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning"
☆45Oct 3, 2025Updated 9 months ago
ethz-spylab / unlearning-vs-safety
View on GitHub
☆27Oct 6, 2024Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
jinzhuoran / RWKU
View on GitHub
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024
☆100Sep 30, 2024Updated last year
centerforaisafety / wmdp
View on GitHub
WMDP is a LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m…
☆176May 29, 2025Updated last year
googleinterns / localizing-paragraph-memorization
View on GitHub
☆15Feb 21, 2024Updated 2 years ago
rucnyz / LeakAgent
View on GitHub
☆29Aug 31, 2025Updated 10 months ago
licong-lin / negative-preference-optimization
View on GitHub
☆76Jul 15, 2024Updated 2 years ago
SALT-NLP / Efficient_Unlearning
View on GitHub
☆38Oct 18, 2023Updated 2 years ago
kernelmachine / demix-data
View on GitHub
Benchmark API for Multidomain Language Modeling
☆25Aug 26, 2022Updated 3 years ago
DYR1 / MoGU
View on GitHub
Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability.
☆18Jan 14, 2025Updated last year
INK-USC / FiD-ICL
View on GitHub
"FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning" (ACL 2023)
☆15Jul 24, 2023Updated 3 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
wenzhe-li / Self-MoA
View on GitHub
☆17Feb 4, 2025Updated last year
yikee / FLIP
View on GitHub
Small Reward Models via Backward Inference
☆21May 25, 2026Updated 2 months ago
google-research / lm-extraction-benchmark
View on GitHub
☆307Jun 10, 2026Updated last month
liujch1998 / memo-trap
View on GitHub
☆23Jan 25, 2023Updated 3 years ago
sanketvmehta / lifelong-learning-pretraining-and-sam
View on GitHub
Code for the paper "Mehta, S. V., Patil, D., Chandar, S., & Strubell, E. (2023). An Empirical Investigation of the Role of Pre-training i…
☆18Mar 18, 2024Updated 2 years ago
pluskid / structural-regularity
View on GitHub
☆23Apr 5, 2023Updated 3 years ago
sail-sg / closer-look-LLM-unlearning
View on GitHub
[ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models
☆49Dec 4, 2024Updated last year
chrisliu298 / awesome-llm-unlearning
View on GitHub
A resource repository for machine unlearning in large language models
☆617Updated this week
AntResearchNLP / AlignXplore
View on GitHub
Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals
☆11Jan 8, 2026Updated 6 months ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
thunlp / FalseQA
View on GitHub
Repo for ACL2023 paper "Won't Get Fooled Again: Answering Questions with False Premises"
☆22Jun 11, 2023Updated 3 years ago
MurrayTom / SG-Bench
View on GitHub
SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types
☆26Nov 29, 2024Updated last year
rishub-tamirisa / tamper-resistance
View on GitHub
[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
☆68Jun 9, 2025Updated last year
GCYZSL / O3-LLM-UNLEARNING
View on GitHub
☆19May 18, 2025Updated last year
huseyinatahaninan / Differentially-Private-Fine-tuning-of-Language-Models
View on GitHub
☆79May 28, 2022Updated 4 years ago
zzwjames / FailureLLMUnlearning
View on GitHub
An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)
☆39Feb 22, 2025Updated last year
Li-Hyn / LLM_CatastrophicForgetting
View on GitHub
Code for LLM_Catastrophic_Forgetting via SAM.
☆11Jun 7, 2024Updated 2 years ago
agentic-learning-ai-lab / anticipatory-recovery
View on GitHub
Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training"
☆11Oct 27, 2025Updated 8 months ago
Hazelsuko07 / InstaHide_Challenge
View on GitHub
A challenge to investigate the security of the InstaHide protocol.
☆12Dec 7, 2020Updated 5 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
CatOneTwo / WSDDN-PyTorch
View on GitHub
Implementation of Weakly Supervised Deep Detection Networks with PyTorch
☆12Dec 7, 2022Updated 3 years ago
princeton-polaris-lab / Evaluating-Durable-Safeguards
View on GitHub
[ICLR 2025] On Evluating the Durability of Safegurads for Open-Weight LLMs
☆13Jun 20, 2025Updated last year
yjw1029 / Self-Reminder
View on GitHub
Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI.
☆57Nov 13, 2023Updated 2 years ago
wonderNefelibata / Awesome-LRM-Safety
View on GitHub
Awesome Large Reasoning Model(LRM) Safety.This repository is used to collect security-related research on large reasoning models such as …
☆84Updated this week
keven980716 / weak-to-strong-deception
View on GitHub
[ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
☆15Jun 21, 2024Updated 2 years ago
dong-river / Personalized-Judge
View on GitHub
☆10Jun 15, 2024Updated 2 years ago
DependableSystemsLab / MIA_defense_HAMP
View on GitHub
Code for the paper "Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction" …
☆13Sep 6, 2023Updated 2 years ago