lhb8125 / Megatron-LM
Ongoing research training transformer models at scale
☆18 · Feb 5, 2026 · Updated last week
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- LLM training technologies developed by Kwai ☆70 · Jan 21, 2026 · Updated 3 weeks ago
- Pipeline Parallelism Emulation and Visualization ☆77 · Jan 8, 2026 · Updated last month
- Multiple GEMM operators built with CUTLASS to support LLM inference. ☆20 · Aug 3, 2025 · Updated 6 months ago
- ☆42 · Sep 8, 2025 · Updated 5 months ago
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆93 · Jan 16, 2026 · Updated 3 weeks ago
- Estimate MFU for DeepSeekV3 ☆26 · Jan 5, 2025 · Updated last year
- Allow torch tensor memory to be released and resumed later ☆217 · Updated this week
- The dataset and baseline code for ASC23 LLM inference optimization challenge. ☆32 · Dec 20, 2023 · Updated 2 years ago
- 📦 A Command Line Tool for downloading protein structures, sequences and MSAs ☆10 · Nov 21, 2017 · Updated 8 years ago
- LaTeX notes on semiconductor device physics ☆12 · Apr 19, 2025 · Updated 9 months ago
- A simple API to use CUPTI ☆11 · Aug 19, 2025 · Updated 5 months ago
- A Light CNN Framework! ☆16 · Apr 8, 2019 · Updated 6 years ago
- A reproduction of the libsmctrl paper, with an added Python interface that can be called flexibly from Python to allocate compute resources ☆12 · May 21, 2024 · Updated last year
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core. ☆162 · Jan 22, 2026 · Updated 3 weeks ago
- Zero Bubble Pipeline Parallelism ☆449 · May 7, 2025 · Updated 9 months ago
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an… ☆12 · Apr 17, 2025 · Updated 9 months ago
- ☆15 · Jul 13, 2025 · Updated 7 months ago
- Generates a systags file for Vim use. ☆10 · Mar 2, 2020 · Updated 5 years ago
- Example of binding a TF32 CUTLASS GEMM kernel to PyTorch ☆12 · Jun 7, 2024 · Updated last year
- ☆15 · Oct 30, 2025 · Updated 3 months ago
- Agent framework for generating a synthetic dataset. This will be raw CoT and Reflection output to be cleaned up by a later step. ☆15 · Apr 11, 2025 · Updated 10 months ago
- AIGC evals ☆10 · Dec 2, 2023 · Updated 2 years ago
- Source Code for Partial Interference ☆10 · Dec 17, 2022 · Updated 3 years ago
- An automatic roll-call program based on face recognition (with a GUI) ☆13 · Dec 8, 2018 · Updated 7 years ago
- ☆18 · Apr 16, 2025 · Updated 9 months ago
- This is my implementation of CPN on LSP in PyTorch. ☆11 · Apr 15, 2019 · Updated 6 years ago
- To pioneer training long-context multi-modal transformer models ☆72 · Aug 8, 2025 · Updated 6 months ago
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… ☆86 · Sep 11, 2025 · Updated 5 months ago
- ☆24 · May 9, 2025 · Updated 9 months ago
- Bazel build rules for creating ebooks in PDF, EPUB and MOBI format ☆13 · Feb 8, 2026 · Updated last week
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… ☆14 · Feb 4, 2025 · Updated last year
- ☆159 · Dec 27, 2024 · Updated last year
- A simple OneDrive command line client ☆11 · Sep 2, 2025 · Updated 5 months ago
- Implementation from scratch in C of the Multi-head latent attention used in the DeepSeek-V3 technical paper. ☆19 · Jan 15, 2025 · Updated last year
- Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation ☆19 · Jun 11, 2025 · Updated 8 months ago
- Takagi-san, who is good at filling in forms ☆12 · Dec 6, 2022 · Updated 3 years ago
- Bazel repository_rule for using libraries from a local LLVM installation in your BUILD files. Supports LLVM, Clang and MLIR. ☆13 · Mar 24, 2021 · Updated 4 years ago
- LoRAFusion: Efficient LoRA Fine-Tuning for LLMs ☆23 · Sep 23, 2025 · Updated 4 months ago
- Stores articles for the WeChat public account 'CVDaily' ☆11 · Feb 7, 2018 · Updated 8 years ago