distributed trainer for LLMs
☆589 · May 20, 2024 · Updated last year
Alternatives and similar repositories for Megatron-LLM
Users that are interested in Megatron-LLM are comparing it to the libraries listed below.
- Best practice for training LLaMA models in Megatron-LM ☆663 · Jan 2, 2024 · Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,437 · Mar 20, 2024 · Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆69 · Jul 20, 2023 · Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆2,241 · Aug 14, 2025 · Updated 7 months ago
- Ongoing research training transformer models at scale ☆15,985 · Updated this week
- The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud ☆1,551 · Dec 15, 2025 · Updated 3 months ago
- ☆84 · Sep 9, 2023 · Updated 2 years ago
- Zero Bubble Pipeline Parallelism ☆452 · May 7, 2025 · Updated 11 months ago
- Minimalistic large language model 3D-parallelism training ☆2,644 · Updated this week
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,673 · Mar 8, 2024 · Updated 2 years ago
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads ☆2,720 · Jun 25, 2024 · Updated last year
- A LLaMA1/LLaMA2 Megatron implementation.