swj0419 / muse_bench
☆33 · Mar 13, 2025 · Updated 11 months ago
Alternatives and similar repositories for muse_bench
Users who are interested in muse_bench are comparing it to the repositories listed below.
- ☆31 · Aug 9, 2024 · Updated last year
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024 ☆90 · Sep 30, 2024 · Updated last year
- [NeurIPS D&B '25] The one-stop repository for LLM unlearning ☆479 · Dec 24, 2025 · Updated last month
- ☆15 · Feb 21, 2024 · Updated last year
- [NeurIPS25] Official repo for "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning" ☆41 · Oct 3, 2025 · Updated 4 months ago
- [NeurIPS 2022] Explaining Graph Neural Networks with Structure-Aware Cooperative Games (GStarX) ☆14 · Oct 20, 2022 · Updated 3 years ago
- ☆28 · Aug 31, 2025 · Updated 5 months ago
- WMDP is a LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m… ☆158 · May 29, 2025 · Updated 8 months ago
- ☆37 · Oct 18, 2023 · Updated 2 years ago
- [ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models ☆44 · Dec 4, 2024 · Updated last year
- Private Adaptive Optimization with Side Information (ICML '22) ☆16 · Jun 23, 2022 · Updated 3 years ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types ☆24 · Nov 29, 2024 · Updated last year
- Code for the paper "Mehta, S. V., Patil, D., Chandar, S., & Strubell, E. (2023). An Empirical Investigation of the Role of Pre-training i… ☆17 · Mar 18, 2024 · Updated last year
- ☆20 · Feb 11, 2024 · Updated 2 years ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022. ☆23 · Nov 23, 2022 · Updated 3 years ago
- [ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation… ☆141 · May 27, 2025 · Updated 8 months ago
- ☆19 · Mar 6, 2023 · Updated 2 years ago
- ☆21 · Mar 17, 2025 · Updated 10 months ago
- [ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs" ☆66 · Jun 9, 2025 · Updated 8 months ago
- Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI. ☆56 · Nov 13, 2023 · Updated 2 years ago
- ☆60 · Mar 9, 2023 · Updated 2 years ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆37 · Feb 22, 2025 · Updated 11 months ago
- ☆29 · Feb 10, 2025 · Updated last year
- ☆73 · Jul 15, 2024 · Updated last year
- A survey of privacy problems in Large Language Models (LLMs). Contains summary of the corresponding paper along with relevant code ☆69 · May 30, 2024 · Updated last year
- ☆25 · Nov 14, 2022 · Updated 3 years ago
- Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning" ☆29 · Oct 1, 2024 · Updated last year
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆71 · Nov 1, 2022 · Updated 3 years ago
- [NeurIPS 2024] Large Language Model Unlearning via Embedding-Corrupted Prompts ☆38 · Sep 26, 2024 · Updated last year
- Code and data to go with the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks" ☆36 · Dec 18, 2024 · Updated last year
- ☆78 · May 28, 2022 · Updated 3 years ago
- Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as … ☆81 · Feb 6, 2026 · Updated last week
- A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020 ☆38 · Jul 10, 2022 · Updated 3 years ago
- Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action ☆37 · Apr 3, 2023 · Updated 2 years ago
- Trending projects & awesome papers about data-centric llm studies. ☆40 · May 20, 2025 · Updated 8 months ago
- [EMNLP 2025 Main] ConceptVectors Benchmark and Code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces" ☆39 · Aug 20, 2025 · Updated 5 months ago
- [NAACL 2025 Main] Official Implementation of MLLMU-Bench ☆48 · Mar 13, 2025 · Updated 11 months ago
- https://icml.cc/virtual/2023/poster/24354 ☆10 · Aug 15, 2023 · Updated 2 years ago
- ☆10 · Oct 2, 2024 · Updated last year