PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
☆568 · Jun 8, 2026 · Updated this week
Alternatives and similar repositories for Automodel
Users who are interested in Automodel are comparing it to the libraries listed below.
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability ☆691 · Updated this week
- Scalable toolkit for efficient model reinforcement ☆1,711 · Updated this week
- Open-source toolkit for training, Priming, and serving next generation Hybrid architectures ☆71 · May 9, 2026 · Updated last month
- Implementation from scratch in C of the Multi-head latent attention used in the DeepSeek-V3 technical paper. ☆18 · Jan 15, 2025 · Updated last year
- Accelerating MoE with IO and Tile-aware Optimizations ☆707 · May 14, 2026 · Updated 3 weeks ago
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library ☆52 · Aug 20, 2025 · Updated 9 months ago
- State-of-the-art framework for fast, large-scale training and inference of diffusion models ☆54 · May 20, 2026 · Updated 3 weeks ago
- Minimalistic large language model 3D-parallelism training ☆2,711 · May 26, 2026 · Updated 2 weeks ago
- Scalable data preprocessing and curation toolkit for LLMs ☆1,601 · Updated this week
- The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using NVIDIA libraries such as… ☆20 · Sep 17, 2025 · Updated 8 months ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co… ☆16 · Jan 16, 2026 · Updated 4 months ago
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning ☆215 · Jun 2, 2026 · Updated last week
- Efficient Long-context Language Model Training by Core Attention Disaggregation ☆105 · Apr 7, 2026 · Updated 2 months ago
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… ☆3,381 · Updated this week
- Ongoing research training transformer models at scale ☆18 · Updated this week
- A PyTorch native platform for training generative AI models ☆5,416 · Updated this week
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr… ☆337 · Nov 11, 2025 · Updated 6 months ago
- End-to-end pipeline for PPIFlow ☆23 · Feb 17, 2026 · Updated 3 months ago
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs ☆1,021 · Mar 3, 2026 · Updated 3 months ago
- Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime. ☆1,523 · Updated this week
- ☆50 · May 20, 2025 · Updated last year
- ☆33 · Apr 19, 2025 · Updated last year
- Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP ☆108 · Aug 20, 2025 · Updated 9 months ago
- CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs ☆193 · May 22, 2026 · Updated 2 weeks ago
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com… ☆540 · May 30, 2026 · Updated last week
- [NAACL'25 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… ☆16 · Feb 4, 2025 · Updated last year
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… ☆61 · Oct 27, 2025 · Updated 7 months ago
- A simple API to use CUPTI ☆10 · Aug 19, 2025 · Updated 9 months ago
- ☆33 · Dec 31, 2025 · Updated 5 months ago
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio… ☆108 · Sep 11, 2025 · Updated 8 months ago
- Ongoing research training transformer models at scale ☆16,617 · Updated this week
- A high-performance RL training-inference weight synchronization framework, designed to enable second-level parameter updates from trainin… ☆158 · May 25, 2026 · Updated 2 weeks ago
- ☆11 · Apr 16, 2026 · Updated last month
- A tool to configure, launch and manage your machine learning experiments. ☆244 · Jun 3, 2026 · Updated last week
- Video Diffusion Transformers are In-Context Learners ☆37 · Jan 6, 2025 · Updated last year
- Composable and Embeddable Communication Runtime for Distributed AI Services ☆101 · Updated this week
- A lightweight, user-friendly data-plane for LLM training. ☆39 · Sep 10, 2025 · Updated 9 months ago
- A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular… ☆573 · May 18, 2026 · Updated 3 weeks ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆24 · Jun 3, 2026 · Updated last week