jinbooooom/ai-infra-hpc

HPC tutorial covering collective communication (MPI, NCCL), CUDA programming, SIMD vectorization, RDMA communication, and more

☆616

Alternatives and similar repositories for ai-infra-hpc

Users who are interested in ai-infra-hpc are comparing it to the repositories listed below.

jinbooooom / OriginDL
Implement a PyTorch-like DL library in C++ from scratch, step by step
☆339 · Apr 15, 2026 · Updated 3 months ago
CalvinXKY / InfraTech
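Sharing AI Infra knowledge and code exercises: PyTorch, vLLM/SGLang, and slime/vime framework primers ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more
☆3,135 · Updated this week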
xlite-dev / LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
☆11,631 · Updated this week
Tongkaio / CUDA_Kernel_Samples
Hand-written CUDA operators and an interview guide
☆1,044 · Aug 23, 2025 · Updated 11 months ago
Infrasys-AI / AIInfra
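AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper-layer software stack, that supports training and inference of large AI models.
☆7,714 · Dec 22, 2025 · Updated 7 months ago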
caomaolufei / AIInfraGuide
Full-stack AI Infra learning materials for getting started from scratch: https://caomaolufei.github.io/AIInfraGuide/
☆1,381 · Updated this week
InfiniTensor / InfiniTrain
☆46 · Updated this week
Tencent / hpc-ops
High Performance LLM Inference Operator Library
☆1,065 · Updated this week
CalvinXKY / BasicCUDA
A tutorial for CUDA & PyTorch
☆478 · Mar 23, 2026 · Updated 4 months ago
WingEdge777 / vitamin-cuda
🍎 One kernel a day keeps high latency away. A hands-on CUDA learning path featuring a rich collection of kernels, from the basics to pea…
☆184 · Jul 19, 2026 · Updated last week
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,635 · Apr 26, 2026 · Updated 3 months ago
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,032 · Updated this week
ForceInjection / AI-fundamentals
AI fundamentals: GPU architecture, CUDA programming, large-model basics, and AI Agent topics.
☆1,972 · Updated this week
sgl-project / mini-sglang
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
☆4,628 · May 17, 2026 · Updated 2 months ago
NKKdev / TFFinfer
LLM Inference Framework
☆30 · May 20, 2026 · Updated 2 months ago
melonedo / algebraic-layouts
☆23 · Aug 20, 2025 · Updated 11 months ago
RL-Align / RL-Kernel
High-performance RL post-training infrastructure. Designed to achieve bitwise operator-level train-inference consistency across heterogen…
☆213 · Updated this week
zhaochenyang20 / Awesome-ML-SYS-Tutorial
My learning notes for ML SYS.
☆6,772 · Updated this week
PaddleJitLab / CUDATutorial
A self-learning tutorial for CUDA High Performance Programming.
☆1,052 · Jan 14, 2026 · Updated 6 months ago
cr7258 / ai-infra-learning
This repository organizes materials, recordings, and schedules related to AI-infra learning meetings.
☆531 · Mar 1, 2026 · Updated 4 months ago
lzyrapx / LeetGPU
🌈 Solutions of LeetGPU
☆94 · Jun 11, 2026 · Updated last month
gxinlong / cuda-optimization-skill
A skill for automatically optimizing CUDA code.
☆42 · Mar 26, 2026 · Updated 4 months ago
harleyszhang / lite_llama
A lightweight llama-like LLM inference framework based on Triton kernels.
☆188 · Jan 5, 2026 · Updated 6 months ago
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,125 · Updated this week
william20001120 / CUDA_Kernel_Learn
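An example repository for learning and practicing CUDA kernel optimization, covering matrix multiplication (SGEMM), matrix transpose, various reductions (sum/max/softmax/matrix softmax), GEMV, element-wise operators, LayerNorm, as well as cuBLAS comparisons and several introductory examples. The goal is to break down typical optimizations step by step…
☆26 · Aug 20, 2025 · Updated 11 months ago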
DD-DuDa / Cute-Learning
Examples of CUDA implementations by Cutlass CuTe
☆280 · Jul 1, 2025 · Updated last year
ArthurinRUC / cutlass-notes
From Minimal GEMM to Everything
☆227 · Jul 9, 2026 · Updated 2 weeks ago
BBuf / AI-Infra-Auto-Driven-SKILLS
☆696 · Jul 14, 2026 · Updated last week
out-or-outstanding / tinyvllm
A reimplementation of nanovllm with added comments
☆15 · Jan 27, 2026 · Updated 5 months ago
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,999 · Updated this week
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆242 · Jan 20, 2026 · Updated 6 months ago
ifromeast / cuda_learning
Learning how CUDA works
☆399 · Mar 3, 2025 · Updated last year
sii-research / VCCL
Venus Collective Communication Library, supported by SII and Infrawaves.
☆152 · Jun 24, 2026 · Updated last month
gpu-mode / lectures
Material for gpu-mode lectures
☆6,355 · Jun 15, 2026 · Updated last month
Wenyueh / MinivLLM
Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementation
☆939 · Updated this week
pacoxu / AI-Infra
init to record my learning path of AI Infra, especially on inference.
☆247 · Updated this week
smart-lty / nano-PEARL
Draft-Target Disaggregation LLM Serving System via Parallel Speculative Decoding.
☆211 · Mar 18, 2026 · Updated 4 months ago
KuangjuX / cuda-evolve-oss
Autonomous GPU kernel optimization system driven by AI agents.
☆31 · Mar 29, 2026 · Updated 3 months ago
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
☆6,908 · Updated this week