infinigence/FUSCO

High-performance distributed data shuffling (all-to-all) library for MoE training and inference

☆112

Alternatives and similar repositories for FUSCO

Users who are interested in FUSCO are comparing it to the libraries listed below.

DeepLink-org / DLSlime
DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit
☆92 · Jan 26, 2026 · Updated last month
KuangjuX / NVSHMEM-Tutorial
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆165 · Feb 11, 2026 · Updated 3 weeks ago
cherichy / tilecute
☆32 · Jul 2, 2025 · Updated 8 months ago
alibaba-edu / qwen-bailian-usagetraces-anon
☆88 · Jan 22, 2026 · Updated last month
eth-easl / sailor
AI model training on heterogeneous, geo-distributed resources
☆39 · Nov 24, 2025 · Updated 3 months ago
flashserve / PAT
Prefix-Aware Attention for LLM Decoding
☆29 · Jan 23, 2026 · Updated last month
triple-mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆17 · May 22, 2024 · Updated last year
lzhangbv / acpsgd
[ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
☆10 · Apr 28, 2023 · Updated 2 years ago
hao-ai-lab / DistCA
Efficient Long-context Language Model Training by Core Attention Disaggregation
☆92 · Updated this week
suzukimain / auto_diffusers
diffusers with search engine
☆12 · Jan 13, 2026 · Updated last month
wu-kan / wuk_cupti_wrapper
A simple API for using CUPTI
☆11 · Aug 19, 2025 · Updated 6 months ago
Mellanox / nic-configuration-operator
NVIDIA Networking NIC Configuration Operator For Kubernetes
☆14 · Mar 1, 2026 · Updated last week
host-architecture / Fast-and-Safe-IO-Memory-Protection
☆13 · Nov 21, 2024 · Updated last year
H-Huang / torch_collective_extension
A minimal demo of PyTorch distributed extension functionality for collectives.
☆15 · Jul 29, 2024 · Updated last year
xdit-project / DiTCacheAnalysis
An auxiliary project analyzing the characteristics of KV in DiT attention.
☆33 · Nov 29, 2024 · Updated last year
psg-mit / nightjarpy
Python library to add support for embedding natural code in Python with shared program state.
☆24 · Jan 20, 2026 · Updated last month
fpgasystems / Chameleon-RAG-Acceleration
☆20 · Jun 1, 2025 · Updated 9 months ago
thomaschlt / mla.c
From-scratch C implementation of the multi-head latent attention used in the DeepSeek-V3 technical paper.
☆18 · Jan 15, 2025 · Updated last year
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆223 · Jan 20, 2026 · Updated last month
thustorage / Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆42 · May 13, 2025 · Updated 9 months ago
TransferQueue / TransferQueue
[Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…
☆13 · Jan 16, 2026 · Updated last month
jiazhihao / attention_superoptimizer
An Attention Superoptimizer
☆22 · Jan 20, 2025 · Updated last year
lcy-seso / DLFrameworkTest
My tests and experiments with some popular DL frameworks.
☆17 · Sep 11, 2025 · Updated 5 months ago
harnets / multiverse
GPU-accelerated LLM Training Simulator
☆18 · Jun 26, 2025 · Updated 8 months ago
wangrunji0408 / rjrouter
[AFK] Hardware router in Chisel (THU Network Joint Lab 2020)
☆14 · Oct 8, 2020 · Updated 5 years ago
InfiniTensor / TinyInfiniTrain
Training camp project for the training track
☆26 · Jan 28, 2026 · Updated last month
triton-lang / Triton-to-tile-IR
Incubator repo for the CUDA-TileIR backend
☆112 · Updated this week
tile-ai / TileOPs
☆88 · Updated this week
gudiandian / ElasticFlow
☆17 · May 10, 2024 · Updated last year
flashinfer-ai / cubloaty
A size profiler for CUDA binaries
☆72 · Jan 15, 2026 · Updated last month
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆138 · Feb 27, 2026 · Updated last week
harvard-cns / Harvard-CNS-Seminar
Reading seminar of the Harvard Cloud Networking and Systems Group
☆16 · Aug 29, 2022 · Updated 3 years ago
BorisPis / nicmem-asplos22-artifact
☆18 · Dec 11, 2023 · Updated 2 years ago
NVIDIA / nvshmem
NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…
☆471 · Feb 28, 2026 · Updated last week
Victarry / PyTorch-Memory-Profiler
☆42 · Sep 8, 2025 · Updated 6 months ago
infinigence / HamiltonAttention
☆41 · Oct 15, 2025 · Updated 4 months ago
nkalyanv99 / UNI-D2
☆53 · Jan 23, 2026 · Updated last month
liangyuRain / ForestColl
☆16 · Apr 22, 2025 · Updated 10 months ago
liangyuwang / Tiny-Megatron
Tiny-Megatron, a minimalistic re-implementation of the Megatron library
☆23 · Sep 1, 2025 · Updated 6 months ago