PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
☆69Aug 22, 2023Updated 2 years ago
Alternatives and similar repositories for soft-moe
Users that are interested in soft-moe are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch☆345Apr 2, 2025Updated last year
- ☆21Oct 22, 2025Updated 5 months ago
- ☆17Jun 9, 2024Updated last year
- ☆28Jul 11, 2024Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 9 months ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models☆159Jul 9, 2025Updated 9 months ago
- ☆19Nov 5, 2024Updated last year
- ☆95Apr 3, 2023Updated 3 years ago
- GoldFinch and other hybrid transformer components☆46Jul 20, 2024Updated last year
- Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch☆380Jun 17, 2024Updated last year
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation☆13Jun 17, 2024Updated last year
- Code for the paper "Interpreting and Improving Diffusion Models from an Optimization Perspective", appearing in ICML 2024☆14Sep 30, 2024Updated last year
- ☆30Sep 28, 2023Updated 2 years ago
- The official repository for the experiments included in the paper titled "Patch-level Routing in Mixture-of-Experts is Provably Sample-ef…☆14Feb 12, 2026Updated 2 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- ☆18Jun 20, 2024Updated last year
- PELA: Learning Parameter-Efficient Models with Low-Rank Approximation [CVPR 2024]☆19Apr 14, 2024Updated 2 years ago
- An unofficial implementation for paper "DenseCLIP: Extract Free Dense Labels from CLIP"☆24Jan 27, 2022Updated 4 years ago
- [ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation☆33Mar 24, 2026Updated 3 weeks ago
- Learning generative models with Sinkhorn Loss☆31Nov 9, 2018Updated 7 years ago
- ☆20Mar 17, 2026Updated 3 weeks ago
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning☆237Dec 3, 2024Updated last year
- Self Reproduction Code of Paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention (MIT CSAIL)☆17May 24, 2024Updated last year
- Related papers about Referring Image Segmentation (RIS)☆16Dec 26, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- [CVPR 2025] CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answeri…☆52Jun 16, 2025Updated 10 months ago
- Building language models to predict more than one token ahead to enable further ahead predictions.☆12May 22, 2025Updated 10 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆21Jul 20, 2024Updated last year
- ICML 2024 Paper "Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies"☆18Jul 10, 2024Updated last year
- This is a SPM12 batch script that runs a standard fMRI preprocessing pipeline on a BIDS formatted data-set.☆13Nov 19, 2020Updated 5 years ago
- SAM4SS: Tailoring SAM and SAM2 for Semantic Segmentation☆11Jul 31, 2024Updated last year
- [IEEE TIP 2025] This repo is the official implementation of "STPNet: Scale-aware Text Prompt Network for Medical Image Segmentation"☆24Jul 17, 2025Updated 8 months ago
- Pseudo-Bag Mixup Augmentation for Multiple Instance Learning-Based Whole Slide Image Classification (IEEE TMI 2024)☆67Mar 17, 2025Updated last year
- ☆12Sep 25, 2019Updated 6 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Recent Advances in Vision-Language Pre-training!☆32Jan 10, 2022Updated 4 years ago
- Streaming Thinking for VideoLLM Streaming Video Understanding☆91Mar 30, 2026Updated 2 weeks ago
- [ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models☆56Jul 9, 2024Updated last year
- Official PyTorch implementation of our ICCV2023 paper “When Prompt-based Incremental Learning Does Not Meet Strong Pretraining”☆16Jan 8, 2024Updated 2 years ago
- An awesome list that curates the best Flet tools, tutorials, blogs and more.☆10Jan 8, 2023Updated 3 years ago
- Official Repository for ICML 2024 Paper "OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport"☆23Dec 4, 2025Updated 4 months ago
- ☆15Jul 25, 2024Updated last year