Toy reproduction of the Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts
☆31 · Sep 1, 2024 · Updated last year
Alternatives and similar repositories for lossfreebalance
Users who are interested in lossfreebalance are comparing it to the libraries listed below.
- ☆37 · Feb 26, 2024 · Updated 2 years ago
- [KDD 2025] MM-Path: Multi-modal, Multi-granularity Path Representation Learning. ☆16 · Jan 9, 2025 · Updated last year
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated 2 years ago
- [AAAI 2025] Holistic Semantic Representation for Navigational Trajectory Generation ☆18 · Mar 7, 2026 · Updated 2 months ago
- ☆27 · Jun 29, 2025 · Updated 10 months ago
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning… ☆11 · Feb 6, 2023 · Updated 3 years ago
- A free and open-source focus stacking software that supports multi-focus image alignment and fusion. ☆26 · Feb 5, 2026 · Updated 3 months ago
- ROS package for sending robots to a series of waypoints ☆10 · Dec 10, 2021 · Updated 4 years ago
- Technical Challenge Repository for Visual Anomaly Detection Workshop (VAND) at CVPR ☆14 · Jul 21, 2025 · Updated 9 months ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆30 · Feb 6, 2026 · Updated 3 months ago
- Travel Time Prediction Based on Tensor Decomposition and Graph Embedding ☆29 · Dec 25, 2020 · Updated 5 years ago
- Spectral Sphere Optimizer ☆116 · Mar 23, 2026 · Updated last month
- Automated neural architecture search algorithms implemented in PyTorch and Autogluon toolkit. ☆12 · Apr 17, 2020 · Updated 6 years ago
- Code for the paper "Representing Spatial Trajectories as Distributions" ☆13 · Jan 17, 2023 · Updated 3 years ago
- Cross Visual Prompt Tuning [ICCV 2025] ☆13 · Aug 3, 2025 · Updated 9 months ago
- [AAAI 2023] Official implementation of FiTs: Fine-grained Two-stage Training for Knowledge Base Question Answering ☆11 · Mar 10, 2023 · Updated 3 years ago
- Triton implementation of bi-directional (non-causal) linear attention ☆75 · Mar 1, 2026 · Updated 2 months ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation ☆100 · Apr 7, 2026 · Updated last month
- DCIC22 (Digital China 2022) cattle image segmentation competition, fourth-place solution ☆14 · Jul 18, 2022 · Updated 3 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated 2 years ago
- 3D deformable convolution network (DCN) for head and neck tumor segmentation ☆11 · May 4, 2023 · Updated 3 years ago
- ☆22 · Apr 14, 2025 · Updated last year
- Code for "Boosting Semi-supervised Image Segmentation with Global and Local Mutual Information Regularization" ☆13 · Jul 14, 2021 · Updated 4 years ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆11 · Dec 30, 2024 · Updated last year
- [NeurIPS 2025] Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP". ☆98 · Nov 2, 2025 · Updated 6 months ago
- Official code for "In Search of Robust Measures of Generalization" (NeurIPS 2020) ☆28 · Dec 22, 2020 · Updated 5 years ago
- [ICML 2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning" ☆10 · Jul 24, 2022 · Updated 3 years ago
- (WACV'24) Kaizen: Practical self-supervised continual learning with continual fine-tuning ☆16 · Oct 29, 2024 · Updated last year
- Pytorch routines for (Ker)nel (Mac)hines ☆12 · Oct 10, 2025 · Updated 7 months ago
- Stanford Cars dataset by classes folder ☆19 · Nov 7, 2024 · Updated last year
- ☆21 · Jul 23, 2025 · Updated 9 months ago
- [ICML 2026] Esoteric Language Models ☆117 · May 1, 2026 · Updated 2 weeks ago
- Official TensorFlow implementation of "RECALL: Replay-based Continual Learning in Semantic Segmentation", ICCV 2021 ☆19 · Oct 7, 2021 · Updated 4 years ago
- ☆22 · Dec 23, 2024 · Updated last year
- ☆12 · Dec 30, 2020 · Updated 5 years ago
- This is the official PyTorch implementation of ASAG (ICCV 2023). ☆18 · Sep 9, 2023 · Updated 2 years ago
- [CVPR 2023] Implementation of out-of-candidate rectification methods ☆15 · Feb 28, 2023 · Updated 3 years ago
- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display ☆11 · Feb 13, 2024 · Updated 2 years ago
- An Enterprise LLM chat system using LibreChat, AWS Bedrock and LDAP/AD Authentication ☆16 · Mar 5, 2026 · Updated 2 months ago