alpa-projects/alpa

Training and serving large-scale neural networks with auto parallelization.

☆3,180

Alternatives and similar repositories for alpa

Users who are interested in alpa are comparing it with the libraries listed below.

flexflow / flexflow-train
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
☆1,898 • Updated this week
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,219 • Updated this week
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,363 • Oct 28, 2024 • Updated last year
NVIDIA / FasterTransformer
Transformer related optimization, including BERT, GPT
☆6,445 • Mar 27, 2024 • Updated 2 years ago
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,411 • Apr 26, 2025 • Updated last year
triton-lang / triton
Development repository for the Triton language and compiler
☆19,789 • Updated this week
pytorch / PiPPy
Pipeline Parallelism for PyTorch
☆786 • Aug 21, 2024 • Updated last year
facebookresearch / metaseq
Repo for external large-scale work
☆6,552 • Apr 27, 2024 • Updated 2 years ago
NVIDIA / TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,448 • Updated this week
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,539 • Updated this week
deepspeedai / DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
☆2,108 • Jun 30, 2025 • Updated last year
awslabs / slapo
A schedule language for large model training
☆153 • Aug 21, 2025 • Updated 11 months ago
deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,813 • Updated this week
merrymercy / awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
☆2,769 • Oct 19, 2024 • Updated last year
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 • Sep 19, 2024 • Updated last year
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,037 • Updated this week
microsoft / Tutel
Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4
☆1,001 • Updated this week
facebookincubator / AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N…
☆4,727 • Jul 14, 2026 • Updated last week
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆8,341 • Updated this week
ELS-RD / kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab…
☆1,585 • Jan 28, 2026 • Updated 5 months ago
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆464 • May 7, 2025 • Updated last year
apache / tvm
Open Machine Learning Compiler Framework
☆13,612 • Updated this week
pytorch / torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
☆1,078 • Apr 17, 2024 • Updated 2 years ago
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,498 • Jul 20, 2026 • Updated last week
ModelTC / LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…
☆4,194 • Updated this week
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆145 • Jun 25, 2022 • Updated 4 years ago
facebookresearch / xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
☆10,534 • Jul 15, 2026 • Updated last week
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,753 • Jan 8, 2024 • Updated 2 years ago
hidet-org / hidet
An open-source efficient deep learning framework/compiler, written in python.
☆743 • Sep 4, 2025 • Updated 10 months ago
alibaba / BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
☆932 • Dec 30, 2024 • Updated last year
punica-ai / punica
Serving multiple LoRA finetuned LLM as one
☆1,168 • May 8, 2024 • Updated 2 years ago
skypilot-org / skypilot
The AI Compute Platform for frontier teams. SkyPilot turns fragmented AI compute into one AI supercomputer, so frontier AI teams build cu…
☆10,405 • Updated this week
deepspeedai / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆2,257 • Aug 14, 2025 • Updated 11 months ago
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,132 • Updated this week
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,882 • Mar 21, 2026 • Updated 4 months ago
petuum / adaptdl
Resource-adaptive cluster scheduler for deep learning training.
☆459 • Mar 5, 2023 • Updated 3 years ago
bytedance / flux
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,346 • Aug 28, 2025 • Updated 10 months ago
ray-project / ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
☆43,356 • Updated this week
databricks / megablocks
☆1,583 • Mar 25, 2026 • Updated 4 months ago