OscarXZQ / weight-selection

☆182

Alternatives and similar repositories for weight-selection

Users who are interested in weight-selection are comparing it to the libraries listed below.


bfshi / TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
☆189 Updated last year
lucidrains / soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
☆309 Updated 4 months ago
Arnav0400 / ViT-Slim
Official code for our CVPR'22 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space”
☆250 Updated last year
kirill-vish / Beyond-INet
Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"
☆101 Updated 10 months ago
kaiyuyue / nxtp
[CVPR'24 Highlight] PyTorch Implementation of Object Recognition as Next Token Prediction
☆180 Updated 3 months ago
TomerRonen34 / mixed-resolution-vit
☆51 Updated last year
sdc17 / UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
☆105 Updated 7 months ago
UCDvision / NOLA
Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"
☆55 Updated 11 months ago
mu-cai / matryoshka-mm
Matryoshka Multimodal Models
☆112 Updated 6 months ago
VILA-Lab / GBLM-Pruner
Is gradient information useful for pruning of LLMs?
☆46 Updated last year
gstoica27 / ZipIt
A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr…
☆303 Updated last year
nbasyl / DoRA
Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"
☆124 Updated last year
fkodom / soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
☆75 Updated last year
eric-ai-lab / PEViT
Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"
☆105 Updated last year
naver-ai / augsub
[CVPR 2025] Official PyTorch implementation of MaskSub "Masking meets Supervision: A Strong Learning Alliance"
☆45 Updated 4 months ago
WailordHe / DenseSSM
A repository for DenseSSMs
☆88 Updated last year
gstoica27 / KnOTS
Model Merging with SVD to Tie the KnOTS [ICLR 2025]
☆60 Updated 4 months ago
zju-vipa / training_free_model_merging
This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).
☆31 Updated last year
mrflogs / LoRA-Pro
Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?"
☆127 Updated 3 months ago
kyegomez / Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆103 Updated last week
snu-mllab / LayerMerge
Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024)
☆30 Updated 11 months ago
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆98 Updated 10 months ago
htqin / IR-QLoRA
[ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…
☆67 Updated last year
naver-ai / model-stock
Model Stock: All we need is just a few fine-tuned models
☆119 Updated 10 months ago
magic-research / Dataset_Quantization
[ICCV2023] Dataset Quantization
☆259 Updated last year
LeapLabTHU / EfficientTrain
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio…
☆222 Updated 11 months ago
DAMO-NLP-SG / Inf-CLIP
[CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for C…
☆263 Updated 6 months ago
layer6ai-labs / fusemix
Data-Efficient Multimodal Fusion on a Single GPU
☆66 Updated last year
FreedomIntelligence / LongLLaVA
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
☆207 Updated 6 months ago
VITA-Group / AsViT
[ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa…
☆76 Updated 3 years ago