ZhentingWang / DUMP
☆ 11 · Updated last week
Alternatives and similar repositories for DUMP:
Users interested in DUMP are comparing it to the repositories listed below.
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" (☆ 11, updated last week)
- Code for the paper "A Sober Look at Progress in Language Model Reasoning" (☆ 36, updated last week)
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) (☆ 77, updated 6 months ago)
- [ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang (☆ 16, updated last year)
- Code for reproducing the paper "Low Rank Adapting Models for Sparse Autoencoder Features" (☆ 10, updated 3 weeks ago)
- Official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) (☆ 26, updated 2 months ago)
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning" (☆ 9, updated 3 months ago)
- Code for T-MARS data filtering (☆ 35, updated last year)
- Official implementation of "Bootstrapping Language Models via DPO Implicit Rewards" (☆ 43, updated last week)
- Code repo for the paper "Attacking Vision-Language Computer Agents via Pop-ups" (☆ 29, updated 4 months ago)
- Codebase for "Decoding Compressed Trust".