bwconrad / soft-moeLinks

PyTorch implementation of "From Sparse to Soft Mixtures of Experts"

☆60

Alternatives and similar repositories for soft-moe

Users that are interested in soft-moe are comparing it to the libraries listed below

Sorting:

eric-ai-lab / PEViT
Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"
☆105Updated 2 years ago
zhangq327 / U-MAE
Official Code for NeurIPS 2022 Paper: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders
☆67Updated last year
mlvlab / TokenMixup
Official pytorch implementation of NeurIPS 2022 paper, TokenMixup
☆48Updated 2 years ago
WailordHe / DenseSSM
A repository for DenseSSMs
☆88Updated last year
VITA-Group / Diverse-ViT
[CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong C…
☆25Updated 3 years ago
scale-lab / MTLoRA
The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24)
☆58Updated last month
UMass-Embodied-AGI / Mod-Squad
☆91Updated 2 years ago
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33Updated 2 years ago
fkodom / soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
☆75Updated last year
YeonwooSung / LIMoE-pytorch
PyTorch implementation of LIMoE
☆53Updated last year
PKU-ML / non_neg
Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning
☆45Updated last year
Haochen-Wang409 / HPM
[CVPR'23 & TPAMI'25] Hard Patches Mining for Masked Image Modeling
☆100Updated 3 months ago
AmeenAli / HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆226Updated last year
OliverRensu / ARM
[ICLR2025] This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision
☆83Updated 2 months ago
TsinghuaC3I / SoRA
[EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models
☆80Updated last year
HKUNLP / efficient-attention
[EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling
☆86Updated 2 years ago
BaohaoLiao / mefts
[NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
☆31Updated 2 years ago
ExplainableML / fomo_in_flux
Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]
☆57Updated 7 months ago
zju-vipa / training_free_model_merging
This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).
☆31Updated last year
lucidrains / soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch
☆309Updated 4 months ago
liuxingbin / dbot
[ICLR2024] Exploring Target Representations for Masked Autoencoders
☆56Updated last year
modestyachts / ImageNetV2_pytorch
ImageNetV2 Pytorch Dataset
☆41Updated 2 years ago
iancovert / locality-alignment
☆51Updated 6 months ago
rgeirhos / dataset-pruning-metrics
Metrics for "Beyond neural scaling laws: beating power law scaling via data pruning " (NeurIPS 2022 Outstanding Paper Award)
☆56Updated 2 years ago
UCSC-VLAA / DMAE
[CVPR 2023] This repository includes the official implementation our paper "Masked Autoencoders Enable Efficient Knowledge Distillers"
☆107Updated 2 years ago
swaggy-TN / EfficientVLM
EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)
☆30Updated 2 years ago
leo-yangli / VB-LoRA
This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024).
☆39Updated 9 months ago
enyac-group / supmae
This is a offical PyTorch/GPU implementation of SupMAE.
☆78Updated 2 years ago
UCDvision / NOLA
Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis"
☆55Updated 11 months ago
bfshi / TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
☆189Updated 2 years ago