tanganke / weight-ensembling_MoELinks

Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts"

☆29

Alternatives and similar repositories for weight-ensembling_MoE

Users that are interested in weight-ensembling_MoE are comparing it to the libraries listed below

Sorting:

EnnengYang / AdaMerging
AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.
☆88Updated 9 months ago
harveyhuang18 / EMR_Merging
[NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging
☆62Updated 5 months ago
which47 / LLMCL
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
☆34Updated 8 months ago
EnnengYang / RepresentationSurgery
Representation Surgery for Multi-Task Model Merging. ICML, 2024.
☆46Updated 9 months ago
nik-dim / tall_masks
Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]
☆47Updated 9 months ago
clemneo / llava-interp
☆61Updated 9 months ago
tanganke / peta
Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization"
☆21Updated 10 months ago
Dongping-Chen / MLLM-Judge
[ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.
☆78Updated 5 months ago
TUDB-Labs / MixLoRA
State-of-the-art Parameter-Efficient MoE Fine-tuning Method
☆175Updated 11 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆81Updated last month
Model-GLUE / Model-GLUE
☆15Updated 11 months ago
Dereck0602 / Awesome_Test_Time_LLMs
☆117Updated 4 months ago
ycjing / Awesome-Model-Merging
A curated list of Model Merging methods.
☆92Updated 10 months ago
osehmathias / lisa
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆33Updated last year
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆107Updated last month
MrGGLS / BlockPruner
A block pruning framework for LLMs.
☆24Updated 2 months ago
TsinghuaC3I / SoRA
[EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models
☆80Updated last year
yaojin17 / Unlearning_LLM
[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
☆59Updated 10 months ago
r-three / smear
☆30Updated last year
yule-BUAA / MergeLLM
Codes for Merging Large Language Models
☆33Updated last year
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆81Updated 5 months ago
nickjiang2378 / vl-interp
Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR '25)
☆76Updated 2 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆42Updated 9 months ago
locuslab / massive-activations
Code accompanying the paper "Massive Activations in Large Language Models"
☆174Updated last year
prateeky2806 / ties-merging
☆185Updated last year
VITA-Group / SEAL
Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆39Updated 4 months ago
mrflogs / LoRA-Pro
Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized? "
☆127Updated 4 months ago
adymaharana / d2pruning
☆35Updated last year
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆40Updated last year
socialfoundations / tttlm
Test-time-training on nearest neighbors for large language models
☆45Updated last year