lhb8125/Megatron-LM

Ongoing research training transformer models at scale

☆19

Alternatives and similar repositories for Megatron-LM

Users who are interested in Megatron-LM are comparing it to the libraries listed below.

Victarry / PP-Schedule-Visualization
Pipeline Parallelism Emulation and Visualization
☆85 · Jun 30, 2026 · Updated 3 weeks ago
kwai / Megatron-Kwai
LLM training technologies developed by kwai
☆71 · Jun 30, 2026 · Updated 3 weeks ago
Bruce-Lee-LY / cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20 · Aug 3, 2025 · Updated 11 months ago
yzhaiustc / Optimizing-SGEMV-on-NVIDIA-GPUs
An implementation of SGEMV with performance comparable to cuBLAS.
☆12 · May 21, 2021 · Updated 5 years ago
chenyu-jiang / dcp
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21 · Nov 28, 2025 · Updated 8 months ago
feifeibear / DPSKV3MFU
Estimate MFU for DeepSeekV3
☆26 · Jan 5, 2025 · Updated last year
antgroup / DeepXTrace
DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.
☆101 · Jan 16, 2026 · Updated 6 months ago
inclusionAI / asystem-amem
An NCCL extension library designed to efficiently offload GPU memory allocated by the NCCL communication library.
☆113 · Dec 17, 2025 · Updated 7 months ago
yanring / Megatron-MoE-ModelZoo
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
☆201 · May 29, 2026 · Updated 2 months ago
OpenSQZ / MegatronApp
Toolchain built around Megatron-LM for distributed training
☆97 · May 20, 2026 · Updated 2 months ago
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆464 · May 7, 2025 · Updated last year
ASC-Competition / ASC24-LLM-inference-optimization
The dataset and baseline code for ASC23 LLM inference optimization challenge.
☆34 · Dec 20, 2023 · Updated 2 years ago
biomadeira / BioDownloader
📦 A Command Line Tool for downloading protein structures, sequences and MSAs
☆10 · Nov 21, 2017 · Updated 8 years ago
filmil / bazel-ebook
bazel build rules for creating ebooks in PDF, EPUB and MOBI format
☆12 · Updated this week
UNITES-Lab / Occult
[ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…
☆13 · Apr 17, 2025 · Updated last year
adah1972 / gen_systags
Generates a systags file for Vim use.
☆10 · Mar 2, 2020 · Updated 6 years ago
Tele-AI / TeleTron
To pioneer training long-context multi-modal transformer models
☆75 · Aug 8, 2025 · Updated 11 months ago
bertmaher / tf32_gemm
Example of binding a TF32 CUTLASS GEMM kernel to PyTorch
☆12 · Jun 7, 2024 · Updated 2 years ago
HPMLL / DTC-SpMM_ASPLOS24
☆47 · Jun 19, 2024 · Updated 2 years ago
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆176 · Updated this week
yifuwang / symm-mem-recipes
☆170 · Dec 27, 2024 · Updated last year
Mark-ThinkPad / TCP_Robot
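Computer networks course project: a simple chatbot based on the TCP protocol, written in Python 3. The initial version ran only in the terminal (CLI); the final version adds a "crude" graphical interface for the client, implemented with Qt5 (that is, PyQt5).
☆10 · Jun 17, 2019 · Updated 7 years ago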
wu-kan / wuk_cupti_wrapper
a simple API to use CUPTI
☆10 · Aug 19, 2025 · Updated 11 months ago
ssbuild / aigc_evals
aigc evals
☆10 · Dec 2, 2023 · Updated 2 years ago
xpan413 / FSMoE
☆16 · Jan 14, 2025 · Updated last year
ViffyGwaanl / DeepSeek-Api-Test
Currently, there are many DeepSeek API providers on the market. Use DeepSeek Api Test to test which API performs best.
☆20 · Feb 13, 2025 · Updated last year
willard-yuan / practical-cbir-handbook
A book that tries to give some guidance on content-based image retrieval
☆19 · Oct 16, 2017 · Updated 8 years ago
SenHe / uavdvsm
☆15 · Nov 23, 2020 · Updated 5 years ago
Jinming-Su / CVDaily
Stores articles for the WeChat public account 'CVDaily'
☆11 · Feb 7, 2018 · Updated 8 years ago
woshildh / CPN_lsp_pytorch
This is my implementation of CPN on LSP in PyTorch.
☆11 · Apr 15, 2019 · Updated 7 years ago
chinthysl / AdderNetTensorRT
Nvidia TensorRT implementation of AdderNet for edge deployment
☆10 · Nov 19, 2020 · Updated 5 years ago
samolisov / bazel-llvm-bridge
Bazel repository_rule for using libraries from a local LLVM installation in your BUILD files. Supports LLVM, Clang and MLIR.
☆12 · Mar 24, 2021 · Updated 5 years ago
stes / saliency
Implementing Visual Saliency Models
☆13 · Jan 10, 2018 · Updated 8 years ago
watersink / enet-as-linux
ENet segmentation on Android, based on ncnn
☆17 · Mar 29, 2020 · Updated 6 years ago
thtang / Sparse-Coding-for-Face-Image-Retrieval
work in Advanced Topics in Multimedia Analysis and Indexing
☆15 · Aug 4, 2018 · Updated 7 years ago
mayank31398 / ladder-residual-inference
☆14 · Jul 13, 2025 · Updated last year
ibaoger / OpenH264Demo
OpenH264 decode raw h264 demo.
☆10 · Jul 8, 2017 · Updated 9 years ago
tanzelin430 / libsmctrl
A reproduction of the libsmctrl paper, with an added Python interface that can be called flexibly from Python to allocate compute resources.
☆12 · May 21, 2024 · Updated 2 years ago
amazon-science / mxfp4-llm
Official implementation for Training LLMs with MXFP4
☆130 · Apr 25, 2025 · Updated last year