bwconrad / soft-moe
PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
☆66Updated 2 years ago
Alternatives and similar repositories for soft-moe
Users interested in soft-moe are comparing it to the libraries listed below.
- Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"☆104Updated 2 years ago
- A repository for DenseSSMs☆89Updated last year
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆78Updated 2 years ago
- ☆91Updated 2 years ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆33Updated 2 years ago
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24)☆69Updated 4 months ago
- [EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models☆84Updated last year
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)☆32Updated 2 years ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆33Updated 2 years ago
- Official code for "TOAST: Transfer Learning via Attention Steering"☆186Updated 2 years ago
- PyTorch implementation of LIMoE☆52Updated last year
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆46Updated last year
- Official Code for NeurIPS 2022 Paper: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders☆68Updated 2 years ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling☆87Updated 2 years ago
- Mixture of Attention Heads☆51Updated 3 years ago
- [CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong C…☆25Updated 3 years ago
- 🔥MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer [Official, ICLR 2023]☆21Updated 2 years ago
- ImageNetV2 Pytorch Dataset☆42Updated 2 years ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆60Updated 11 months ago
- Code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720☆57Updated last year
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆39Updated last year
- Metrics for "Beyond neural scaling laws: beating power law scaling via data pruning" (NeurIPS 2022 Outstanding Paper Award)☆57Updated 2 years ago
- [CVPR'23 & TPAMI'25] Hard Patches Mining for Masked Image Modeling & Bootstrap Masked Visual Modeling via Hard Patch Mining☆105Updated 7 months ago
- Official pytorch implementation of NeurIPS 2022 paper, TokenMixup☆48Updated 2 years ago
- [ICLR2025] This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision☆87Updated 5 months ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆32Updated last year
- Repository containing code for blockwise SSL training☆30Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch☆334Updated 7 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"☆56Updated last year
- Distributed Optimization Infra for learning CLIP models☆27Updated last year