adymaharana / d2pruning
☆25 · Updated 11 months ago
Related projects:
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 · Updated 10 months ago
- ☆32 · Updated 10 months ago
- ☆24 · Updated 11 months ago
- [NeurIPS2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆28 · Updated last year
- ☆31 · Updated 8 months ago
- ☆42 · Updated 5 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆40 · Updated 2 weeks ago
- Restore safety in fine-tuned language models through task arithmetic ☆25 · Updated 5 months ago
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆59 · Updated last year
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆34 · Updated 4 months ago
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models ☆56 · Updated 7 months ago
- ☆29 · Updated last year
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆16 · Updated 2 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆64 · Updated 6 months ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆54 · Updated 9 months ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆90 · Updated 5 months ago
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024. ☆13 · Updated 4 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆61 · Updated last year
- ☆38 · Updated 8 months ago
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆13 · Updated 2 months ago
- ☆21 · Updated 2 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆27 · Updated 3 months ago
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation”(https://arxiv.org/abs/2305.… ☆38 · Updated last year
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024] ☆23 · Updated 3 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆42 · Updated last year
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning ☆36 · Updated last year
- ☆12 · Updated 3 months ago
- Source code for the TMLR paper "Black-Box Prompt Learning for Pre-trained Language Models" ☆55 · Updated last year
- ☆40 · Updated 5 months ago
- The source code of the EMNLP 2023 main conference paper: Sparse Low-rank Adaptation of Pre-trained Language Models. ☆62 · Updated 6 months ago