kungfu-team/tenplex

Dynamic resource changes for multi-dimensional parallelism training

☆31

Alternatives and similar repositories for tenplex

Users who are interested in tenplex are comparing it to the libraries listed below.

microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆135 · Jul 2, 2026 · Updated 3 weeks ago
lsds / CubicleOS
Compartmentalised monolithic library OS
☆22 · Jul 15, 2021 · Updated 5 years ago
UMass-LIDS / Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
☆13 · Mar 7, 2024 · Updated 2 years ago
SymbioticLab / Oobleck
A resilient distributed training framework
☆100 · Updated this week
ChandlerGuan / mercury_artifact
☆27 · Oct 1, 2025 · Updated 9 months ago
JF-D / Parcae
☆22 · Apr 22, 2024 · Updated 2 years ago
kangtegong / bpftool
Automated upstream mirror for bpftool stand-alone build.
☆18 · Nov 13, 2025 · Updated 8 months ago
HPDL-Group / Merak
☆86 · Feb 11, 2026 · Updated 5 months ago
TransferQueue / TransferQueue
[Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…
☆16 · Jan 16, 2026 · Updated 6 months ago
siasosp23 / artifacts
☆24 · Aug 15, 2023 · Updated 2 years ago
Mogball / triton_lite
☆20 · May 24, 2025 · Updated last year
microsoft / elasticflow-traces
Integrated Training Platform (ITP) traces used in ElasticFlow paper.
☆31 · Dec 23, 2022 · Updated 3 years ago
thomaschlt / mla.c
Implementation from scratch in C of the Multi-head latent attention used in the Deepseek-v3 technical paper.
☆18 · Jan 15, 2025 · Updated last year
lemyx / tilelang-dsa
DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang
☆47 · Nov 19, 2025 · Updated 8 months ago
S-Lab-System-Group / Lucid
Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs
☆61 · May 21, 2023 · Updated 3 years ago
gajagajago / deepshare
Network Contention-Aware Cluster Scheduling with Reinforcement Learning (IEEE ICPADS'23)
☆20 · Jul 8, 2025 · Updated last year
vipulSharma18 / NCCL-From-First-Principles
NCCL communication API layer, and transport layer created from first principles.
☆16 · Aug 20, 2025 · Updated 11 months ago
CentML / Mist
[EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
☆24 · Apr 13, 2026 · Updated 3 months ago
ByteDance-Seed / StragglerAnalysis
☆56 · Apr 30, 2025 · Updated last year
byungsoo-oh / ml-systems-papers
Curated collection of papers in machine learning systems
☆636 · Feb 7, 2026 · Updated 5 months ago
ChandlerGuan / kperfir_artifact
☆19 · May 9, 2025 · Updated last year
chenyu-jiang / dcp
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21 · Nov 28, 2025 · Updated 7 months ago
hao-ai-lab / DistCA
Efficient Long-context Language Model Training by Core Attention Disaggregation
☆106 · Apr 7, 2026 · Updated 3 months ago
msr-fiddle / blox
☆47 · Jul 4, 2024 · Updated 2 years ago
radixark / miles_diffusion
[Experimental] Miles-diffusion is a post-training framework for large-scale diffusion model training and production workloads, forked fr…
☆23 · Updated this week
foundation-model-stack / vllm-triton-backend
A Triton-only attention backend for vLLM
☆27 · Jul 14, 2026 · Updated last week
promoe-opensource / promoe
☆20 · Jan 27, 2025 · Updated last year
PrimeIntellect-ai / diloco_simple
torch implementation of diloco
☆24 · Jul 17, 2026 · Updated last week
gudiandian / ElasticFlow
☆17 · May 10, 2024 · Updated 2 years ago
joapolarbear / dpro
Analysis for the traces from byteprofile
☆32 · Nov 21, 2023 · Updated 2 years ago
Mr-Linus / NodeSimulator
NodeSimulator can simulate node resources and state in Kubernetes, as well as pod state.
☆11 · Nov 7, 2021 · Updated 4 years ago
casys-kaist / EnvPipe
☆27 · Aug 31, 2023 · Updated 2 years ago
FlexFusion / FlexFusion
The official implementation for the intra-stage fusion technique introduced in https://arxiv.org/abs/2409.13221
☆31 · Apr 22, 2025 · Updated last year
infinigence / HamiltonAttention
☆45 · Oct 15, 2025 · Updated 9 months ago
lsds / LightSaber
Multi-core Window-Based Stream Processing Engine
☆74 · Oct 20, 2021 · Updated 4 years ago
Yu-Maryland / RESPECT
RESPECT: Reinforcement Learning based Edge Scheduling on Pipelined Coral Edge TPUs (DAC'23)
☆11 · Apr 13, 2023 · Updated 3 years ago
kangtegong / pycon-session-release
PyCon Korea 2020 presentation materials
☆14 · Sep 25, 2020 · Updated 5 years ago
Sys-Inventor-Lab / AI4System-OSML
☆14 · Feb 26, 2026 · Updated 4 months ago
HiEST / gpu-topo-aware
GPU topology-aware scheduler
☆13 · Jul 7, 2017 · Updated 9 years ago