Harry-Chen/InfMoE

Inference framework for MoE layers based on TensorRT with Python binding

☆40

Alternatives and similar repositories for InfMoE

Users that are interested in InfMoE are comparing it to the libraries listed below.

Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆45 · Nov 13, 2023 · Updated 2 years ago
bytedance / QSync
Official repository for "IPDPS '24 QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices".
☆20 · Feb 23, 2024 · Updated 2 years ago
raywan-110 / AdaQP
Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training
☆24 · Mar 1, 2024 · Updated 2 years ago
laekov / fastmoe
A fast MoE impl for PyTorch
☆1,856 · Feb 10, 2025 · Updated last year
roastduck / FreeTensor
A language and compiler for irregular tensor programs.
☆152 · Jul 16, 2026 · Updated last week
YJHMITWEB / ExFlow
Explore Inter-layer Expert Affinity in MoE Model Inference
☆16 · May 6, 2024 · Updated 2 years ago
BDAI-Research / DFLOP
☆17 · Apr 16, 2026 · Updated 3 months ago
thu-pacman / HyQuas
A hybrid partitioner based quantum circuit simulation system on GPU
☆46 · Aug 17, 2022 · Updated 3 years ago
zxytim / arithmetic-encoding-compression
☆11 · Apr 3, 2023 · Updated 3 years ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆126 · Jun 23, 2022 · Updated 4 years ago
S-Lab-System-Group / Hydro
Surrogate-based Hyperparameter Tuning System
☆30 · Jun 29, 2023 · Updated 3 years ago
TsinghuaAI / TDS
A plug-in of Microsoft DeepSpeed to fix the bug of DeepSpeed pipeline
☆25 · Apr 16, 2021 · Updated 5 years ago
KireinaHoro / khemu
Binary translation in Rust
☆12 · Jun 22, 2020 · Updated 6 years ago
jiegec / verilog-lang
A hand-written recursive descent Verilog parser.
☆10 · Jun 28, 2026 · Updated 3 weeks ago
dramforever / finlog
Compiling finite generators to digital logic. WIP
☆13 · Aug 24, 2020 · Updated 5 years ago
TsinghuaAI / CPM-2-Pretrain
Code for CPM-2 Pre-Train
☆157 · Mar 18, 2023 · Updated 3 years ago
laekov / panleaf
Write pandoc markdown in OverLeaf
☆12 · Sep 28, 2022 · Updated 3 years ago
KireinaHoro / khtcp
PKU CompNet'19 Lab 2 - Homebrew TCP
☆12 · Nov 29, 2019 · Updated 6 years ago
t123yh / MIPSCPU
A simple MIPS CPU for BUAA CO course (and now NSCSCC).
☆10 · May 15, 2021 · Updated 5 years ago
hku-systems / naspipe
☆14 · Jan 12, 2022 · Updated 4 years ago
uwsampl / nexus
☆85 · Feb 5, 2026 · Updated 5 months ago
tonyzhao-jt / LLM-PQ
Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and …
☆39 · Aug 29, 2025 · Updated 10 months ago
lzhangbv / acpsgd
[ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
☆10 · Apr 28, 2023 · Updated 3 years ago
thu-cs-lab / TCP-Lab-Docs
Documentation for TCP Lab
☆12 · May 15, 2026 · Updated 2 months ago
comaniac / epoi
Benchmark PyTorch Custom Operators
☆14 · Jul 6, 2023 · Updated 3 years ago
haoxizhong / TUOJ
Let's discover a new world. — Edit
☆10 · Jan 6, 2017 · Updated 9 years ago
UT-InfraAI / cuco
An agent for CUDA compute-communication kernel co-design
☆35 · May 7, 2026 · Updated 2 months ago
S-Lab-System-Group / Primo
Primo: Practical Learning-Augmented Systems with Interpretable Models
☆19 · Dec 26, 2023 · Updated 2 years ago
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆36 · Jan 9, 2023 · Updated 3 years ago
gsampler9 / gSampler
☆29 · Aug 14, 2024 · Updated last year
sjtu-epcc / DVABatch
☆21 · May 13, 2022 · Updated 4 years ago
Faraz9877 / H100_GEMM
High-performance GEMM implementation optimized for NVIDIA H100 GPUs, leveraging Hopper architecture's TMA, WGMMA, and Thread Block Cluste…
☆11 · Dec 4, 2024 · Updated last year
thu-pacman / lab-guide
Everything about PACMAN!
☆19 · May 28, 2026 · Updated last month
wangrunji0408 / rjrouter
[AFK] Hardware router in Chisel (THU Network Joint Lab 2020)
☆14 · Oct 8, 2020 · Updated 5 years ago
pgera / efg
GPU based Compressed Graph Traversal
☆16 · Jan 9, 2026 · Updated 6 months ago
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆127 · May 9, 2022 · Updated 4 years ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆94 · Jul 14, 2023 · Updated 3 years ago
nex-agi / NexDR
NexDR (Nex Deep Research), a leading deep research agent that autonomously investigates complex topics and generates rich, structured rep…
☆36 · Dec 4, 2025 · Updated 7 months ago
nex-agi / NexHTML
HTML Agent based on NexAU
☆16 · Nov 20, 2025 · Updated 8 months ago