tanganke / peta
Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization"
☆20 · Updated 5 months ago
Alternatives and similar repositories for peta:
Users interested in peta are also comparing it to the libraries listed below.
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆41 · Updated 4 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆67 · Updated 4 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆39 · Updated 5 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆49 · Updated last week
- Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic ☆23 · Updated 2 months ago
- Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆20 · Updated 9 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆97 · Updated last year
- [ICLR 2025 Spotlight] When Attention Sink Emerges in Language Models: An Empirical View ☆50 · Updated 4 months ago
- [NeurIPS 2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ☆51 · Updated 3 months ago
- ☆60 · Updated 5 months ago
- Data distillation benchmark ☆57 · Updated 3 weeks ago
- LCA-on-the-line (ICML 2024 Oral) ☆11 · Updated 3 weeks ago
- ☆53 · Updated 2 months ago
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning ☆30 · Updated 3 months ago
- [EMNLP 2023 Main] Sparse Low-rank Adaptation of Pre-trained Language Models ☆72 · Updated last year
- [CVPR 2024 Highlight] Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching (G-VBSM) ☆27 · Updated 5 months ago
- ☆16 · Updated 3 weeks ago
- Elucidated Dataset Condensation (NeurIPS 2024) ☆19 · Updated 5 months ago
- Code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR 2023) ☆40 · Updated last year
- ☆17 · Updated 3 months ago
- Implementation of "DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation" (accepted to NAACL 2024 Findings) ☆17 · Updated last month
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion" ☆13 · Updated 11 months ago
- ☆15 · Updated 9 months ago
- [CVPR 2023] "Understanding and Improving Visual Prompting: A Label-Mapping Perspective" by Aochuan Chen, Yuguang Yao, Pin-Yu Chen, Yihua Zha… ☆52 · Updated last year
- Source code of (quasi-)Givens Orthogonal Fine-Tuning, integrated into the peft library ☆14 · Updated 5 months ago
- Code for Merging Large Language Models ☆29 · Updated 7 months ago
- [ICLR 2025 Spotlight] DEEM: Official implementation of "Diffusion models serve as the eyes of large language models for image perception" ☆23 · Updated this week
- MoCLE (first MLLM with MoE for instruction customization and generalization) (https://arxiv.org/abs/2312.12379) ☆34 · Updated 11 months ago
- LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters ☆30 · Updated this week