Toy reproduction of the Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts
☆31Sep 1, 2024Updated last year
Alternatives and similar repositories for lossfreebalance
Users that are interested in lossfreebalance are comparing it to the libraries listed below.
- ☆36Feb 26, 2024Updated 2 years ago
- Use the tokenizer in parallel to achieve superior acceleration☆20Mar 21, 2024Updated 2 years ago
- mobile DFF dataset☆12Nov 26, 2018Updated 7 years ago
- [ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.☆110Dec 20, 2024Updated last year
- Code for the paper All-in-focus Imaging from Event Focal Stack, CVPR 2023.☆13Oct 3, 2025Updated 6 months ago
- Official Pytorch implementation of 'Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning'? (ICLR2024)☆13Mar 8, 2024Updated 2 years ago
- Technical Challenge Repository for Visual Anomaly Detection Workshop (VAND) at CVPR☆13Jul 21, 2025Updated 8 months ago
- Official Implementation of GMR-Conv☆16Feb 15, 2026Updated last month
- Spectral Sphere Optimizer☆114Mar 23, 2026Updated 2 weeks ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- Automated neural architecture search algorithms implemented in PyTorch and Autogluon toolkit.☆12Apr 17, 2020Updated 5 years ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation☆98Apr 1, 2026Updated last week
- Cross Visual Prompt Tuning [ICCV 2025]☆13Aug 3, 2025Updated 8 months ago
- an official PyTorch implementation of the paper "Partial Network Cloning", CVPR 2023☆13Mar 21, 2023Updated 3 years ago
- [AAAI 2023] Official implementation of FiTs: Fine-grained Two-stage Training for Knowledge Base Question Answering☆11Mar 10, 2023Updated 3 years ago
- Triton implementation of bi-directional (non-causal) linear attention☆73Mar 1, 2026Updated last month
- DCIC22 (Digital China 2022): 4th-place solution for the cattle image segmentation competition☆14Jul 18, 2022Updated 3 years ago
- ☆15Mar 30, 2025Updated last year
- ☆25Nov 8, 2021Updated 4 years ago
- [NeurIPS 2025] Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP".☆96Nov 2, 2025Updated 5 months ago
- i-mae Pytorch Repo☆20Apr 6, 2024Updated 2 years ago
- Official code for "In Search of Robust Measures of Generalization" (NeurIPS 2020)☆28Dec 22, 2020Updated 5 years ago
- Esoteric Language Models☆115Mar 27, 2026Updated last week
- [ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"☆10Jul 24, 2022Updated 3 years ago
- Pytorch routines for (Ker)nel (Mac)hines☆11Oct 10, 2025Updated 5 months ago
- Stanford Cars dataset by classes folder☆19Nov 7, 2024Updated last year
- Official TensorFlow implementation of "RECALL: Replay-based Continual Learning in Semantic Segmentation", ICCV 2021☆19Oct 7, 2021Updated 4 years ago
- [cvpr2023] implementation of out-of-candidate rectification methods☆15Feb 28, 2023Updated 3 years ago
- Code for "Learning Unitary Operators with Help From u(n)", AAAI-17. (https://arxiv.org/abs/1607.04903)☆17Jan 10, 2017Updated 9 years ago
- [WACV 2024] BALF: Simple and Efficient Blur Aware Local Feature Detector☆25Mar 9, 2026Updated 3 weeks ago
- uncertainty-guided matting on ICML2023☆12Aug 3, 2023Updated 2 years ago
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆22Feb 19, 2026Updated last month
- AlignX-Family is an open-source research suite for advancing personalization in large language models-spanning data, code, models, and be…☆20Jan 12, 2026Updated 2 months ago
- [ACM MM'23] Official implementation of paper "Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty".☆14Nov 22, 2023Updated 2 years ago
- Inverted triple Pendulum☆16May 13, 2019Updated 6 years ago
- ☆10Dec 9, 2021Updated 4 years ago
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer☆16Sep 7, 2024Updated last year
- A fast approach for translating a series of text prompts into a video. The 2022 NeurIPS Workshop on Machine Learning for Creativity and D…☆33Jul 5, 2023Updated 2 years ago
- A ZLE function that can create codex suggestions☆11Nov 30, 2022Updated 3 years ago