Prototype of MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism
☆27Apr 4, 2025Updated last year
Alternatives and similar repositories for MegaScale-Infer-Prototyp
Users that are interested in MegaScale-Infer-Prototyp are comparing it to the libraries listed below.
- Accepted to MLSys 2026☆75Mar 5, 2026Updated last month
- RPCNIC: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator [HPCA2025]☆14Dec 9, 2024Updated last year
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model☆13Feb 11, 2025Updated last year
- ☆16Feb 10, 2023Updated 3 years ago
- Scaling Laws for Mixture of Experts Models☆15Feb 25, 2025Updated last year
- 🎓Automatically update LLM inference systems papers daily using GitHub Actions (updates every 12 hours)☆12Updated this week
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and …☆37Aug 29, 2025Updated 7 months ago
- Slowdown prediction module of Echo: Simulating Distributed Training at Scale☆13May 17, 2025Updated 10 months ago
- Mixture-of-Experts Multimodal Variational Autoencoder☆15Jul 3, 2025Updated 9 months ago
- [ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization☆25Oct 5, 2025Updated 6 months ago
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking"☆19Jan 25, 2025Updated last year
- Modular RDMA Interface☆109Updated this week
- Python library to add support for embedding natural code in Python with shared program state.☆28Jan 20, 2026Updated 2 months ago
- An asynchronous streaming data management module for efficient post-training.☆42Updated this week
- High-performance distributed data shuffling (all-to-all) library for MoE training and inference☆117Mar 7, 2026Updated last month
- Official code for "Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping" (ICLR 2025)☆29Oct 25, 2025Updated 5 months ago
- [SIGCOMM 2023] PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale☆15Jul 1, 2023Updated 2 years ago
- Extension for https://github.com/jenkinsci/workflow-multibranch-plugin add defaults pipeline script☆21Dec 16, 2023Updated 2 years ago
- NVIDIA Networking NIC Configuration Operator For Kubernetes☆15Updated this week
- Optimal Transport and Optimization related experiments.☆10Jul 22, 2018Updated 7 years ago
- Simulating Distributed Training at Scale☆14Sep 15, 2025Updated 6 months ago
- A framework for generating realistic LLM serving workloads☆110Oct 9, 2025Updated 6 months ago
- ☆11Apr 23, 2020Updated 5 years ago
- Simple PyTorch graph capturing.☆21May 31, 2023Updated 2 years ago
- [CVPR 2025] Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts☆23Jun 22, 2025Updated 9 months ago
- ☆38Jan 10, 2026Updated 2 months ago
- Plato is a system for viewport adaptation based bitrate adaptive VR video streaming.☆16May 1, 2018Updated 7 years ago
- [ICLR 2025] RaSA: Rank-Sharing Low-Rank Adaptation☆10May 19, 2025Updated 10 months ago
- The Easiest Pytorch Implementation of Branching-DQN☆12Feb 10, 2021Updated 5 years ago
- A deep model for speech recognition via Keras(front_end) and TensorFlow(back_end).☆12Feb 16, 2023Updated 3 years ago
- Spatial Transformer Network (STN) provides attention to a particular region in an image by transforming the input image. T…☆15Dec 21, 2020Updated 5 years ago
- ☆46Sep 8, 2025Updated 7 months ago
- [MobiCom '23] AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality☆18Oct 8, 2023Updated 2 years ago
- Low-Latency Live Video Streaming over a Low-Earth-Orbit Satellite Network with DASH☆18Sep 6, 2024Updated last year
- Blazing fast data loading with HuggingFace Dataset and Ray Data☆16Jan 12, 2024Updated 2 years ago
- Implementation of the ICLR 2025 paper "Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models"☆28Apr 2, 2025Updated last year
- NoC simulation using gem5 (a simple tul)☆14Mar 23, 2024Updated 2 years ago
- ☆87Mar 20, 2026Updated 2 weeks ago
- ☆24Sep 17, 2024Updated last year