HuaizhengZhang/AI-Infra-from-Zero-to-Hero

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.

☆4,246

Alternatives and similar repositories for AI-Infra-from-Zero-to-Hero

Users who are interested in AI-Infra-from-Zero-to-Hero are comparing it to the repositories listed below.

microsoft / AI-System
System for AI Education Resource.
☆4,324 · Oct 25, 2024 · Updated last year
openmlsys / openmlsys
《Machine Learning Systems: Design and Implementation》 (V2 is launching soon)
☆4,833 · Mar 15, 2026 · Updated 4 months ago
ucbrise / cs294-ai-sys-sp19
CS294; AI For Systems and Systems For AI
☆225 · Aug 30, 2019 · Updated 6 years ago
merrymercy / awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
☆2,770 · Oct 19, 2024 · Updated last year
zhaochenyang20 / Awesome-ML-SYS-Tutorial
My learning notes for ML SYS.
☆6,801 · Updated this week
AmberLJC / LLMSys-PaperList
Large Language Model (LLM) Systems Paper List
☆2,207 · Jul 25, 2026 · Updated last week
xlite-dev / Awesome-LLM-Inference
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
☆5,435 · Jul 26, 2026 · Updated last week
mcanini / SysML-reading-list
Systems for ML/AI & ML/AI for Systems paper reading list: A curated reading list of computer science research for work at the intersectio…
☆287 · Jun 9, 2025 · Updated last year
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,079 · Updated this week
BBuf / tvm_mlir_learn
compiler learning resources collect.
☆2,758 · May 20, 2026 · Updated 2 months ago
tqchen / tinyflow
Tutorial code on how to build your own Deep Learning System in 2k Lines
☆2,018 · Oct 4, 2018 · Updated 7 years ago
mosharaf / eecs598
Advanced Topics on Systems for X
☆289 · Jul 10, 2024 · Updated 2 years ago
bytedance / byteps
A high performance and generic framework for distributed DNN training
☆3,718 · Oct 3, 2023 · Updated 2 years ago
d2l-ai / d2l-tvm
Dive into Deep Learning Compiler
☆650 · Jun 19, 2022 · Updated 4 years ago
xlite-dev / LeetCUDA
LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.
☆11,679 · Updated this week
apache / tvm
Open Machine Learning Compiler Framework
☆13,635 · Updated this week
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 · Sep 19, 2024 · Updated last year
Jack47 / hack-SysML
The road to hack SysML and become an system expert
☆518 · Sep 25, 2024 · Updated last year
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,504 · Jul 20, 2026 · Updated last week
BBuf / how-to-optim-algorithm-in-cuda
how to optimize some algorithm in cuda.
☆3,176 · Updated this week
AmadeusChan / Awesome-LLM-System-Papers
☆645 · Jan 14, 2026 · Updated 6 months ago
guanh01 / CS692-mlsys
This is the (evolving) reading list for the seminar.
☆62 · Nov 4, 2020 · Updated 5 years ago
flexflow / flexflow-train
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
☆1,898 · Updated this week
dyweb / papers-notebook
Paper reading notes (distributed systems, virtualization, machine learning)
☆2,206 · Jun 1, 2022 · Updated 4 years ago
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆6,119 · Updated this week
gpu-mode / lectures
Material for gpu-mode lectures
☆6,383 · Jun 15, 2026 · Updated last month
alpa-projects / alpa
Training and serving large-scale neural networks with auto parallelization.
☆3,180 · Dec 9, 2023 · Updated 2 years ago
bytedance / flux
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,350 · Aug 28, 2025 · Updated 11 months ago
lambda7xx / awesome-AI-system
paper and its code for AI System
☆377 · May 14, 2026 · Updated 2 months ago
ModelTC / LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…
☆4,202 · Updated this week
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,175 · Updated this week
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆201 · Apr 27, 2022 · Updated 4 years ago
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,286 · Updated this week
NVIDIA / FasterTransformer
Transformer related optimization, including BERT, GPT
☆6,445 · Mar 27, 2024 · Updated 2 years ago
alibaba / BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
☆931 · Dec 30, 2024 · Updated last year
volcengine / veScale
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
☆1,033 · Mar 3, 2026 · Updated 4 months ago
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,749 · Apr 26, 2026 · Updated 3 months ago
triton-lang / triton
Development repository for the Triton language and compiler
☆19,830 · Updated this week
MLSys-Learner-Resources / Awesome-MLSys-Blogger
The repository has collected a batch of noteworthy MLSys bloggers (Algorithms/Systems)
☆341 · Jan 5, 2025 · Updated last year