Ongoing research training transformer models at scale
☆18May 27, 2026Updated this week
Alternatives and similar repositories for Megatron-LM
Users that are interested in Megatron-LM are comparing it to the libraries listed below.
- Pipeline Parallelism Emulation and Visualization☆83Jan 8, 2026Updated 4 months ago
- LLM training technologies developed by Kwai☆72Jan 21, 2026Updated 4 months ago
- Multiple GEMM operators are constructed with cutlass to support LLM inference.☆20Aug 3, 2025Updated 9 months ago
- ☆47Sep 8, 2025Updated 8 months ago
- Estimate MFU for DeepSeekV3☆26Jan 5, 2025Updated last year
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.☆97Jan 16, 2026Updated 4 months ago
- The dataset and baseline code for ASC23 LLM inference optimization challenge.☆34Dec 20, 2023Updated 2 years ago
- Allow torch tensor memory to be released and resumed later☆245May 16, 2026Updated last week
- Zero Bubble Pipeline Parallelism☆456May 7, 2025Updated last year
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆16Feb 4, 2025Updated last year
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…☆13Apr 17, 2025Updated last year
- a simple API to use CUPTI☆10Aug 19, 2025Updated 9 months ago
- Open-source toolkit for training, Priming, and serving next generation Hybrid architectures☆70May 9, 2026Updated 2 weeks ago
- bazel build rules for creating ebooks in PDF, EPUB and MOBI format☆12May 17, 2026Updated last week
- Example of binding a TF32 CUTLASS GEMM kernel to PyTorch☆12Jun 7, 2024Updated last year
- Generates a systags file for Vim use.☆10Mar 2, 2020Updated 6 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆24May 20, 2026Updated last week
- A Light CNN Framework!☆16Apr 8, 2019Updated 7 years ago
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.☆192May 20, 2026Updated last week
- There are currently many DeepSeek API providers on the market. Use DeepSeek Api Test to find out which API performs best.☆20Feb 13, 2025Updated last year
- ☆167Dec 27, 2024Updated last year
- LaTeX notes on semiconductor device physics☆13Apr 19, 2025Updated last year
- A Python Library for the 3GPP physical layer☆17Dec 18, 2025Updated 5 months ago
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- A reproduction of the libsmctrl paper, with an added Python interface that can be called flexibly from Python to allocate compute resources☆12May 21, 2024Updated 2 years ago
- 📦 A Command Line Tool for downloading protein structures, sequences and MSAs☆10Nov 21, 2017Updated 8 years ago
- To pioneer training long-context multi-modal transformer models☆73Aug 8, 2025Updated 9 months ago
- Implementation from scratch in C of the Multi-head latent attention used in the Deepseek-v3 technical paper.☆18Jan 15, 2025Updated last year
- ☆24May 9, 2025Updated last year
- A book that aims to provide a guide to content-based image retrieval☆19Oct 16, 2017Updated 8 years ago
- ☆15Nov 23, 2020Updated 5 years ago
- Tile-based language built for AI computation across all scales☆153May 19, 2026Updated last week
- Implementing Visual Saliency Models☆13Jan 10, 2018Updated 8 years ago
- Coursework for Advanced Topics in Multimedia Analysis and Indexing☆15Aug 4, 2018Updated 7 years ago
- Bazel repository_rule for using libraries from a local LLVM installation in your BUILD files. Supports LLVM, Clang and MLIR.☆12Mar 24, 2021Updated 5 years ago
- [ACL 2026 🔥] CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark☆34Apr 20, 2026Updated last month
- ☆20Apr 24, 2026Updated last month
- ☆14Jul 13, 2025Updated 10 months ago
- ENet segmentation on Android, based on ncnn☆17Mar 29, 2020Updated 6 years ago