OpenSQZ / MegatronApp
Toolchain built around Megatron-LM for distributed training
☆65 Updated 2 weeks ago
Alternatives and similar repositories for MegatronApp
Users who are interested in MegatronApp are comparing it to the libraries listed below.
- A simple calculation for LLM MFU. ☆45 Updated last week
- ☆95 Updated 5 months ago
- Utility scripts for PyTorch (e.g. Memory profiler that understands more low-level allocations such as NCCL) ☆53 Updated last week
- ByteCheckpoint: A Unified Checkpointing Library for LFMs ☆245 Updated 2 months ago
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning ☆126 Updated this week
- JAX backend for SGL ☆60 Updated this week
- PyTorch bindings for CUTLASS grouped GEMM. ☆119 Updated 3 months ago
- Odysseus: Playground of LLM Sequence Parallelism ☆77 Updated last year
- PyTorch bindings for CUTLASS grouped GEMM. ☆146 Updated 3 weeks ago
- Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training ☆216 Updated last year
- Allow torch tensor memory to be released and resumed later ☆135 Updated last week
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆51 Updated last week
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core. ☆93 Updated this week
- ☆64 Updated 4 months ago
- Training library for Megatron-based models ☆85 Updated this week
- ☆50 Updated 4 months ago
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆211 Updated 2 weeks ago
- ☆78 Updated 5 months ago
- Estimate MFU for DeepSeekV3 ☆24 Updated 8 months ago
- Pipeline Parallelism Emulation and Visualization ☆66 Updated 3 months ago
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆68 Updated 3 weeks ago
- ☆126 Updated 3 months ago
- DeeperGEMM: crazy optimized version ☆70 Updated 4 months ago
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit ☆62 Updated this week
- ☆42 Updated last year
- Automated Parallelization System and Infrastructure for Multiple Ecosystems ☆79 Updated 10 months ago
- ☆98 Updated last year
- Make SGLang go brrr ☆30 Updated last week
- An experimental communicating attention kernel based on DeepEP. ☆34 Updated last month
- ☆295 Updated this week