cr7258/ai-infra-learning

This repository organizes materials, recordings, and schedules related to AI-infra learning meetings.

☆527

Alternatives and similar repositories for ai-infra-learning

Users who are interested in ai-infra-learning are comparing it to the repositories listed below.

CalvinXKY / InfraTech
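Sharing AI Infra knowledge and code exercises: PyTorch, vLLM/SGLang, getting started with the slime/vime frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more.
☆3,015 · Jul 2, 2026 · Updated 2 weeks ago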
xlite-dev / LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
☆11,599 · Updated this week
Infrasys-AI / AIInfra
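AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper software layers, that supports training and inference of large AI models.
☆7,666 · Dec 22, 2025 · Updated 6 months ago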
gogongxt / nano-vllm
Nano vLLM
☆25 · Aug 11, 2025 · Updated 11 months ago
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,582 · Apr 26, 2026 · Updated 2 months ago
sgl-project / mini-sglang
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
☆4,616 · May 17, 2026 · Updated 2 months ago
Wenyueh / MinivLLM
Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementations
☆926 · Updated this week
gogongxt / nano-sglang
☆160 · Mar 5, 2026 · Updated 4 months ago
ForceInjection / AI-fundamentals
AI fundamentals: GPU architecture, CUDA programming, large-model basics, and AI Agent topics.
☆1,922 · Updated this week
zjhellofss / KuiperLLama
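A good project for campus recruiting (autumn and spring hiring) and internships: build from scratch a large-model inference framework that supports LLama2/3 and Qwen2.5.
☆552 · Oct 28, 2025 · Updated 8 months ago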
zhaochenyang20 / Awesome-ML-SYS-Tutorial
My learning notes for ML SYS.
☆6,759 · Updated this week
caomaolufei / AIInfraGuide
Full-stack AI Infra learning materials, starting from zero: https://caomaolufei.github.io/AIInfraGuide/
☆1,316 · Jul 10, 2026 · Updated last week
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,941 · Updated this week
wyann22 / aios
☆116 · Jul 13, 2026 · Updated last week
jinbooooom / ai-infra-hpc
HPC tutorial covering collective communication (MPI, NCCL), CUDA programming, SIMD vectorization, RDMA communication, and more.
☆614 · Apr 27, 2026 · Updated 2 months ago
CyperPan / interview4AI
Interview questions for AI infra
☆18 · Mar 22, 2026 · Updated 3 months ago
LDLINGLINGLING / nano_vllm_note
An annotated copy of the nano_vllm repository that also adds MiniCPM4 support and a feature for registering new models.
☆198 · Aug 11, 2025 · Updated 11 months ago
lumia431 / photon_infer
A High-Performance LLM Inference Engine with vLLM-Style Continuous Batching
☆118 · Jan 2, 2026 · Updated 6 months ago
difey / nano-vllm-v1
Nano vLLM v1 engine
☆16 · Aug 6, 2025 · Updated 11 months ago
memory-of-star / OpenSUN
An open platform for exploring scale-up network systems.
☆17 · Mar 16, 2026 · Updated 4 months ago
TheToughCrane / nano-kvllm
This project aims to provide a highly effective KV cache management framework for LLM inference and improve memory utilization and inference sp…
☆67 · Apr 24, 2026 · Updated 2 months ago
Starmys / TritonStudyGroup
☆133 · Sep 22, 2025 · Updated 9 months ago
harleyszhang / lite_llama
A lightweight llama-like LLM inference framework based on Triton kernels.
☆188 · Jan 5, 2026 · Updated 6 months ago
Chtholly-Boss / swizzle
A practical way of learning Swizzle
☆42 · Feb 3, 2025 · Updated last year
InftyAI / Awesome-LLMOps
🎉 An awesome & curated list of best LLMOps tools.
☆253 · Updated this week
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆86,804 · Updated this week
BBuf / how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
☆3,142 · Updated this week
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆5,994 · Updated this week
higress-group / mock-server
An LLM Mock Server that supports simulating the protocols of all LLM providers.
☆15 · Jul 10, 2026 · Updated last week
CalvinXKY / BasicCUDA
A tutorial for CUDA&PyTorch
☆475 · Mar 23, 2026 · Updated 3 months ago
HuaizhengZhang / AI-Infra-from-Zero-to-Hero
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Mod…
☆4,214 · Jul 25, 2025 · Updated 11 months ago
HJCheng0602 / nanoPD
A from-scratch Prefill/Decode disaggregation inference engine for LLMs
☆157 · May 10, 2026 · Updated 2 months ago
kaist-ina / ns3-tlt-rdma-public
This is an official GitHub repository for the paper "Towards timeout-less transport in commodity datacenter networks."
☆15 · Sep 7, 2022 · Updated 3 years ago
BBuf / AI-Infra-Auto-Driven-SKILLS
☆692 · Jul 14, 2026 · Updated last week
PaddleJitLab / CUDATutorial
A self-learning tutorial for CUDA High Performance Programming.
☆1,048 · Jan 14, 2026 · Updated 6 months ago
Infrasys-AI / AISystem
AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.
☆17,237 · Sep 3, 2025 · Updated 10 months ago
KuangjuX / NVSHMEM-Tutorial
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆195 · Feb 11, 2026 · Updated 5 months ago
LeeWant / FirstQuantization
A case study of quantitative modeling for beginners.
☆23 · Jan 26, 2026 · Updated 5 months ago
pacoxu / AI-Infra
Started to record my learning path in AI Infra, especially inference.
☆244 · Updated this week