Prototype of MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism
☆27Apr 4, 2025Updated 11 months ago
Alternatives and similar repositories for MegaScale-Infer-Prototyp
Users who are interested in MegaScale-Infer-Prototyp are comparing it to the libraries listed below.
- Accepted to MLSys 2026☆72Mar 5, 2026Updated 2 weeks ago
- RPCNIC: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator [HPCA2025]☆14Dec 9, 2024Updated last year
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model☆13Feb 11, 2025Updated last year
- MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.☆16Apr 8, 2025Updated 11 months ago
- 🎓Automatically Update LLM Inference Systems Papers Daily using GitHub Actions (Updates Every 12 Hours)☆12Updated this week
- Slowdown prediction module of Echo: Simulating Distributed Training at Scale☆13May 17, 2025Updated 10 months ago
- Modular RDMA Interface☆94Updated this week
- Python library to add support for embedding natural code in Python with shared program state.☆24Jan 20, 2026Updated 2 months ago
- High-performance distributed data shuffling (all-to-all) library for MoE training and inference☆114Mar 7, 2026Updated last week
- A better wrapper for using RDMA programming APIs in Rust flavor☆79Updated this week
- ☆13Mar 24, 2024Updated last year
- NVIDIA Networking NIC Configuration Operator For Kubernetes☆15Updated this week
- Simulating Distributed Training at Scale☆14Sep 15, 2025Updated 6 months ago
- A framework for generating realistic LLM serving workloads☆106Oct 9, 2025Updated 5 months ago
- ☆17Oct 22, 2020Updated 5 years ago
- Memory Topology for GPUs☆19Mar 4, 2026Updated 2 weeks ago
- ☆11Apr 23, 2020Updated 5 years ago
- [TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Om Chabra, Arash Nasr-Esfahany, Kevin Zhao, Pratees…☆18Nov 18, 2025Updated 4 months ago
- Simple PyTorch graph capturing.☆21May 31, 2023Updated 2 years ago
- ☆36Jan 10, 2026Updated 2 months ago
- Plato is a system for viewport adaptation based bitrate adaptive VR video streaming.☆16May 1, 2018Updated 7 years ago
- [ICLR 2025] RaSA: Rank-Sharing Low-Rank Adaptation☆10May 19, 2025Updated 10 months ago
- Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness (IJCAI'19).☆13Apr 16, 2021Updated 4 years ago
- The Easiest Pytorch Implementation of Branching-DQN☆12Feb 10, 2021Updated 5 years ago
- A deep model for speech recognition via Keras(front_end) and TensorFlow(back_end).☆12Feb 16, 2023Updated 3 years ago
- ☆44Sep 8, 2025Updated 6 months ago
- Release the power of GPT☆11May 27, 2024Updated last year
- Implementation for the paper: CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference☆35Mar 6, 2025Updated last year
- [MobiCom '23] AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality☆18Oct 8, 2023Updated 2 years ago
- Low-Latency Live Video Streaming over a Low-Earth-Orbit Satellite Network with DASH☆18Sep 6, 2024Updated last year
- Blazing fast data loading with HuggingFace Dataset and Ray Data☆16Jan 12, 2024Updated 2 years ago
- NoC simulation using gem5 (a simple tul)☆14Mar 23, 2024Updated last year
- plget is a tool used to measure latency packets spent in network stack, NIC driver and on the wire, trace interpacket gap, based as on h/…☆16Nov 18, 2019Updated 6 years ago
- ☆23Sep 17, 2024Updated last year
- vLLM adapter for a TGIS-compatible gRPC server.☆55Updated this week
- Efficient Long-context Language Model Training by Core Attention Disaggregation☆96Mar 5, 2026Updated 2 weeks ago
- nv-one-logger enables tracking of GPU application progress over time and can help to identify overhead from workload and cluster ineffici…☆22Nov 6, 2025Updated 4 months ago
- Converting text-LMs into Visual Language Models☆51Jan 31, 2026Updated last month
- Implementation of "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with…☆36Jan 31, 2026Updated last month