mmatena / model_merging
⭐65 · Updated 3 years ago
Alternatives and similar repositories for model_merging:
Users interested in model_merging are comparing it to the libraries listed below.
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models" ⭐97 · Updated last year
- ⭐28 · Updated 8 months ago
- AI Logging for Interpretability and Explainability 🔬 ⭐108 · Updated 9 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ⭐63 · Updated 5 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ⭐42 · Updated 5 months ago
- ⭐16 · Updated 2 weeks ago
- ⭐93 · Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity ⭐69 · Updated 2 weeks ago
- ⭐50 · Updated last year
- ⭐37 · Updated last year
- AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR 2024) ⭐70 · Updated 4 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ⭐73 · Updated 3 months ago
- Repository for "Model Merging by Uncertainty-Based Gradient Matching" (ICLR 2024) ⭐27 · Updated 10 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ⭐150 · Updated last year
- ⭐53 · Updated 2 years ago
- ⭐30 · Updated 3 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ⭐79 · Updated last year
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs ⭐36 · Updated last month
- [NeurIPS 2023] GitHub repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ⭐60 · Updated last year
- Provides the answer to "How to do patching on all available SAEs on GPT-2?". Official repository of the implementation of the p… ⭐11 · Updated last month
- ⭐169 · Updated last year
- Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic ⭐23 · Updated 2 months ago
- ⭐46 · Updated last year
- Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models ⭐140 · Updated 2 years ago
- ⭐71 · Updated this week
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs ⭐42 · Updated 5 months ago
- ⭐28 · Updated last year
- ⭐33 · Updated last year
- Official repository for "Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts" (EMNLP 2022) ⭐100 · Updated 2 years ago
- Learning adapter weights from task descriptions ⭐16 · Updated last year