pytorch/torchtitan

A PyTorch native platform for training generative AI models

☆5,549

Alternatives and similar repositories for torchtitan

Users interested in torchtitan often compare it with the libraries listed below.

meta-pytorch / torchtune
PyTorch native post-training library
☆5,784 · Updated this week
linkedin / Liger-Kernel
Efficient Triton Kernels for LLM Training
☆6,530 · Updated this week
pytorch / ao
PyTorch native quantization and sparsity for training and inference
☆2,910 · Updated this week
huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,760 · May 26, 2026 · Updated last month
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆3,555 · Jul 13, 2026 · Updated last week
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,140 · Updated this week
NVIDIA / TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,435 · Updated this week
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,388 · Updated this week
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purpose
☆2,255 · Aug 26, 2025 · Updated 10 months ago
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,502 · Updated this week
triton-lang / triton
Development repository for the Triton language and compiler
☆19,746 · Updated this week
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆5,994 · Updated this week
meta-pytorch / gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
☆6,228 · Aug 22, 2025 · Updated 10 months ago
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆1,036 · Sep 10, 2025 · Updated 10 months ago
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,495 · Updated this week
meta-pytorch / attention-gym
Helpful tools and examples for working with flex-attention
☆1,211 · Updated this week
facebookresearch / lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
☆4,756 · Jul 18, 2025 · Updated last year
databricks / megablocks
☆1,582 · Mar 25, 2026 · Updated 3 months ago
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,587 · Updated this week
bytedance / flux
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,344 · Aug 28, 2025 · Updated 10 months ago
pytorch / helion
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆912 · Updated this week
facebookresearch / xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
☆10,526 · Updated this week
KellerJordan / modded-nanogpt
NanoGPT (124M) in 90 seconds
☆5,548 · Jul 3, 2026 · Updated 2 weeks ago
meta-pytorch / torchft
Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)
☆524 · Updated this week
sgl-project / sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
☆30,583 · Updated this week
pytorch / PiPPy
Pipeline Parallelism for PyTorch
☆786 · Aug 21, 2024 · Updated last year
ByteDance-Seed / VeOmni
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
☆2,101 · Updated this week
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,569 · Updated this week
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,113 · Updated this week
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,064 · Updated this week
NVIDIA / TensorRT-LLM
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat…
☆14,170 · Updated this week
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,831 · Jul 14, 2026 · Updated last week
huggingface / accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i…
☆9,785 · Updated this week
huggingface / trl
Train transformer language models with reinforcement learning.
☆18,898 · Updated this week
volcengine / veScale
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
☆1,031 · Mar 3, 2026 · Updated 4 months ago
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,411 · Apr 26, 2025 · Updated last year
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆8,337 · Updated this week
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
☆6,681 · Updated this week
allenai / OLMo
Modeling, training, eval, and inference code for OLMo
☆6,600 · Nov 24, 2025 · Updated 7 months ago