Jiachen-T-Wang / GREATS
☆16 Updated 4 months ago
Alternatives and similar repositories for GREATS
Users who are interested in GREATS are comparing it to the libraries listed below.
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?" ☆35 Updated 3 weeks ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆46 Updated 10 months ago
- ☆15 Updated 11 months ago
- A Sober Look at Language Model Reasoning ☆81 Updated last month
- [ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities ☆18 Updated 4 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆59 Updated 10 months ago
- ☆35 Updated 7 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 Updated last year
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆39 Updated 4 months ago
- ☆51 Updated last year
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆47 Updated 9 months ago
- ☆24 Updated 5 months ago
- Lightweight Adapting for Black-Box Large Language Models ☆23 Updated last year
- Source code for the TMLR paper "Black-Box Prompt Learning for Pre-trained Language Models" ☆56 Updated last year
- ☆71 Updated 3 years ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆31 Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆39 Updated 9 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆73 Updated 10 months ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆74 Updated 5 months ago
- This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR2023). ☆48 Updated last year
- Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples ☆38 Updated 3 weeks ago
- Official code repository for Correct-N-Contrast ☆22 Updated 3 years ago
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue