Toolchain built around Megatron-LM for distributed training
☆92 · Mar 23, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for MegatronApp
Users that are interested in MegatronApp are comparing it to the libraries listed below.
- Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc. ☆58 · Mar 4, 2026 · Updated last month
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… ☆99 · Sep 11, 2025 · Updated 6 months ago
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆95 · Jan 16, 2026 · Updated 2 months ago
- Tiny-Megatron, a minimalistic re-implementation of the Megatron library ☆23 · Sep 1, 2025 · Updated 7 months ago
- ☆19 · May 11, 2024 · Updated last year
- ☆88 · Aug 16, 2025 · Updated 7 months ago
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit ☆95 · Mar 31, 2026 · Updated last week
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization ☆22 · Feb 5, 2026 · Updated 2 months ago
- ☆38 · Aug 7, 2025 · Updated 8 months ago
- ☆51 · Sep 26, 2025 · Updated 6 months ago
- Allow torch tensor memory to be released and resumed later ☆233 · Mar 10, 2026 · Updated last month
- Pipeline Parallelism Emulation and Visualization ☆81 · Jan 8, 2026 · Updated 3 months ago
- ☆26 · Mar 9, 2026 · Updated last month
- Tutorials for NVIDIA CUPTI samples ☆61 · Nov 3, 2025 · Updated 5 months ago
- A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training ☆765 · Updated this week
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… ☆60 · Oct 27, 2025 · Updated 5 months ago
- Official implementation of TBA for async LLM post-training. ☆29 · Nov 5, 2025 · Updated 5 months ago
- ☆46 · Sep 8, 2025 · Updated 7 months ago
- To pioneer training long-context multi-modal transformer models ☆74 · Aug 8, 2025 · Updated 8 months ago
- Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning ☆29 · Jul 14, 2025 · Updated 8 months ago
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs ☆1,004 · Mar 3, 2026 · Updated last month
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆276 · Apr 1, 2026 · Updated last week
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library ☆51 · Aug 20, 2025 · Updated 7 months ago
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core. ☆182 · Mar 17, 2026 · Updated 3 weeks ago
- Codes for MO's Trading ☆15 · Mar 20, 2022 · Updated 4 years ago
- ☆13 · May 8, 2023 · Updated 2 years ago
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning ☆205 · Apr 2, 2026 · Updated last week
- LLM training technologies developed by kwai ☆71 · Jan 21, 2026 · Updated 2 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 · Jul 4, 2025 · Updated 9 months ago
- ☆13 · Aug 6, 2019 · Updated 6 years ago
- VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo ☆1,794 · Apr 3, 2026 · Updated last week
- Sound event detection in real life audio with CNN submitted to DCASE16 ☆22 · Jun 10, 2022 · Updated 3 years ago
- Spectral Sphere Optimizer ☆114 · Mar 23, 2026 · Updated 2 weeks ago
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability ☆553 · Updated this week
- a simple API to use CUPTI ☆10 · Aug 19, 2025 · Updated 7 months ago
- GPT-jax based on the official huggingface library ☆13 · Jun 22, 2021 · Updated 4 years ago
- Implementation of the SOTA Transformer architecture from PaLM - Scaling Language Modeling with Pathways in JAX/Flax ☆14 · Jun 22, 2022 · Updated 3 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆22 · Mar 25, 2026 · Updated 2 weeks ago
- ☆52 · Apr 30, 2025 · Updated 11 months ago