tanganke / weight-ensembling_MoE
Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts"
☆18 · Updated 8 months ago
Alternatives and similar repositories for weight-ensembling_MoE:
Users interested in weight-ensembling_MoE are comparing it to the repositories listed below.
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆65 · Updated 3 months ago
- [NeurIPS 2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ☆48 · Updated 2 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆44 · Updated 3 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆25 · Updated last week
- Codebase for decoding compressed trust. ☆23 · Updated 9 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Updated 8 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆38 · Updated 4 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆39 · Updated 3 months ago
- A curated list of Model Merging methods. ☆89 · Updated 5 months ago
- Code for Merging Large Language Models ☆29 · Updated 6 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆69 · Updated 3 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆27 · Updated 11 months ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆32 · Updated 3 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆49 · Updated 4 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆41 · Updated 4 months ago
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆17 · Updated last week
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization" ☆18 · Updated 5 months ago
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML 2023) ☆33 · Updated last year
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆52 · Updated 4 months ago
- ConceptVectors benchmark and code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces" ☆32 · Updated last week
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆39 · Updated 3 months ago