OpenSQZ / MegatronApp
Toolchain built around Megatron-LM for distributed training
☆70 · Updated this week
Alternatives and similar repositories for MegatronApp
Users who are interested in MegatronApp compare it to the libraries listed below.
 - ☆97 · Updated 7 months ago
 - A simple calculation for LLM MFU. ☆48 · Updated last month
 - ByteCheckpoint: An Unified Checkpointing Library for LFMs ☆249 · Updated 3 months ago
 - ☆32 · Updated this week
 - Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… ☆63 · Updated last month
 - DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆65 · Updated last week
 - Estimate MFU for DeepSeekV3 ☆26 · Updated 9 months ago
 - Odysseus: Playground of LLM Sequence Parallelism ☆78 · Updated last year
 - ☆79 · Updated 6 months ago
 - Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training ☆217 · Updated last year
 - Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆225 · Updated last week
 - Allow torch tensor memory to be released and resumed later ☆160 · Updated this week
 - Bridge Megatron-Core to Hugging Face/Reinforcement Learning ☆143 · Updated this week
 - Make SGLang go brrr ☆40 · Updated last week
 - ☆102 · Updated 5 months ago
 - APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra… ☆37 · Updated 3 weeks ago
 - DeeperGEMM: crazy optimized version ☆72 · Updated 5 months ago
 - Training library for Megatron-based models ☆158 · Updated this week
 - PyTorch bindings for CUTLASS grouped GEMM. ☆125 · Updated 5 months ago
 - A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆68 · Updated 2 months ago
 - JAX backend for SGL ☆101 · Updated this week
 - ☆101 · Updated last year
 - PyTorch bindings for CUTLASS grouped GEMM. ☆161 · Updated 3 weeks ago
 - LLM Serving Performance Evaluation Harness ☆80 · Updated 8 months ago
 - A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆52 · Updated last year
 - ☆130 · Updated 5 months ago
 - ☆121 · Updated last year
 - Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core. ☆123 · Updated this week
 - ☆43 · Updated last year
 - ☆310 · Updated last month