Toolchain built around Megatron-LM for distributed training
☆88 Dec 7, 2025 Updated 2 months ago
Alternatives and similar repositories for MegatronApp
Users who are interested in MegatronApp compare it to the libraries listed below
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… ☆88 Sep 11, 2025 Updated 5 months ago
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆93 Jan 16, 2026 Updated last month
- LoRAFusion: Efficient LoRA Fine-Tuning for LLMs ☆23 Sep 23, 2025 Updated 5 months ago
- Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc. ☆55 Nov 11, 2025 Updated 3 months ago
- ☆49 Sep 26, 2025 Updated 5 months ago
- ☆42 Sep 8, 2025 Updated 5 months ago
- Tiny-Megatron, a minimalistic re-implementation of the Megatron library ☆23 Sep 1, 2025 Updated 6 months ago
- Official implementation of TBA for async LLM post-training. ☆29 Nov 5, 2025 Updated 3 months ago
- ☆19 May 11, 2024 Updated last year
- ☆38 Aug 7, 2025 Updated 6 months ago
- Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning ☆28 Jul 14, 2025 Updated 7 months ago
- Package of Pathways-on-Cloud utilities ☆25 Updated this week
- Allow torch tensor memory to be released and resumed later ☆220 Feb 9, 2026 Updated 2 weeks ago
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit ☆92 Jan 26, 2026 Updated last month
- A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training ☆650 Updated this week
- Pipeline Parallelism Emulation and Visualization ☆79 Jan 8, 2026 Updated last month
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs ☆938 Nov 27, 2025 Updated 3 months ago
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core. ☆166 Jan 22, 2026 Updated last month
- ☆87 Aug 16, 2025 Updated 6 months ago
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library ☆50 Aug 20, 2025 Updated 6 months ago
- Tutorials for NVIDIA CUPTI samples ☆55 Nov 3, 2025 Updated 3 months ago
- LLM training technologies developed by Kwai ☆70 Jan 21, 2026 Updated last month
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability ☆459 Updated this week
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… ☆58 Oct 27, 2025 Updated 4 months ago
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆122 Nov 15, 2023 Updated 2 years ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the … ☆264 Updated this week
- ☆51 Apr 30, 2025 Updated 10 months ago
- ☆34 Sep 14, 2024 Updated last year
- NexRL is an ultra-loosely-coupled LLM post-training framework. ☆98 Feb 14, 2026 Updated 2 weeks ago
- torchcomms: a modern PyTorch communications API ☆338 Updated this week
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆62 Dec 19, 2025 Updated 2 months ago
- Artifact from "Hardware Compute Partitioning on NVIDIA GPUs". THIS IS A FORK OF BAKITAS REPO. I AM NOT ONE OF THE AUTHORS OF THE PAPER. ☆55 Nov 24, 2025 Updated 3 months ago
- NVIDIA Inference Xfer Library (NIXL) ☆898 Updated this week
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 Jul 4, 2025 Updated 7 months ago
- An alternative to OpenFaaS nats-queue-worker for long-running functions ☆11 Dec 14, 2022 Updated 3 years ago
- This project compares the performance of Swin-Transformer v2 implemented in JAX and PyTorch. ☆12 Jun 8, 2022 Updated 3 years ago
- ☆33 Dec 23, 2025 Updated 2 months ago
- Nonblocking data structures ☆12 Jan 25, 2015 Updated 11 years ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi… ☆23 Oct 1, 2025 Updated 4 months ago