PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
★505 · May 16, 2026 · Updated this week
Alternatives and similar repositories for Automodel
Users that are interested in Automodel are comparing it to the libraries listed below.
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability (★642 · Updated this week)
- Scalable toolkit for efficient model reinforcement (★1,635 · Updated this week)
- Open-source toolkit for training, Priming, and serving next generation Hybrid architectures (★69 · May 9, 2026 · Updated last week)
- Implementation from scratch in C of the Multi-head latent attention used in the DeepSeek-V3 technical paper. (★18 · Jan 15, 2025 · Updated last year)
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library (★52 · Aug 20, 2025 · Updated 9 months ago)
- Minimalistic large language model 3D-parallelism training (★2,690 · Apr 7, 2026 · Updated last month)
- The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using NVIDIA libraries such as… (★19 · Sep 17, 2025 · Updated 8 months ago)
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co… (★15 · Jan 16, 2026 · Updated 4 months ago)
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning (★211 · Updated this week)
- Efficient Long-context Language Model Training by Core Attention Disaggregation (★103 · Apr 7, 2026 · Updated last month)
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… (★3,340 · Updated this week)
- Ongoing research training transformer models at scale (★18 · Updated this week)
- Scalable data preprocessing and curation toolkit for LLMs (★1,569 · May 13, 2026 · Updated last week)
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr… (★334 · Nov 11, 2025 · Updated 6 months ago)
- Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime. (★1,340 · Updated this week)
- A PyTorch native platform for training generative AI models (★5,339 · May 14, 2026 · Updated last week)
- Accelerating MoE with IO and Tile-aware Optimizations (★684 · May 14, 2026 · Updated last week)
- The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1 (★68 · May 8, 2026 · Updated last week)
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs (★1,009 · Mar 3, 2026 · Updated 2 months ago)
- (★49 · May 20, 2025 · Updated last year)
- (★33 · Apr 19, 2025 · Updated last year)
- Tools and Scripts for running Isaac Sim workloads on Omniverse Farm (★12 · Jun 15, 2025 · Updated 11 months ago)
- Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP (★107 · Aug 20, 2025 · Updated 9 months ago)
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com… (★532 · May 5, 2026 · Updated 2 weeks ago)
- [NAACL'25 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… (★16 · Feb 4, 2025 · Updated last year)
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… (★61 · Oct 27, 2025 · Updated 6 months ago)
- (★33 · Dec 31, 2025 · Updated 4 months ago)
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… (★107 · Sep 11, 2025 · Updated 8 months ago)
- A high-performance RL training-inference weight synchronization framework, designed to enable second-level parameter updates from trainin… (★150 · Apr 22, 2026 · Updated 3 weeks ago)
- Ongoing research training transformer models at scale (★16,340 · Updated this week)
- A tool to configure, launch and manage your machine learning experiments. (★242 · Updated this week)
- Video Diffusion Transformers are In-Context Learners (★36 · Jan 6, 2025 · Updated last year)
- Composable and Embeddable Communication Runtime for Distributed AI Services (★101 · May 14, 2026 · Updated last week)
- A lightweight, user-friendly data-plane for LLM training. (★39 · Sep 10, 2025 · Updated 8 months ago)
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling (★24 · May 14, 2026 · Updated last week)
- A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular… (★568 · Updated this week)
- (★189 · Updated this week)
- Ship correct and fast LLM kernels to PyTorch (★150 · Jan 14, 2026 · Updated 4 months ago)
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core. (★192 · Updated this week)