Infrasys-AI/AISystem

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Infrasys-AI/AISystem)

Infrasys-AI / AISystem

AISystem 主要是指AI系统，包括AI芯片、AI编译器、AI推理和训练框架等AI全栈底层技术

☆17,237

Alternatives and similar repositories for AISystem

Users that are interested in AISystem are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Infrasys-AI / AIInfra
View on GitHub
AIInfra（AI 基础设施）指AI系统从底层芯片等硬件，到上层软件栈支持AI大模型训练和推理。
☆7,666Dec 22, 2025Updated 6 months ago
xlite-dev / LeetCUDA
View on GitHub
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
☆11,599Updated this week
BBuf / tvm_mlir_learn
View on GitHub
compiler learning resources collect.
☆2,759May 20, 2026Updated 2 months ago
openmlsys / openmlsys
View on GitHub
《Machine Learning Systems: Design and Implementation》 (V2 is launching soon）
☆4,824Mar 15, 2026Updated 4 months ago
BBuf / how-to-optim-algorithm-in-cuda
View on GitHub
how to optimize some algorithm in cuda.
☆3,142Updated this week
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
zjhellofss / KuiperInfer
View on GitHub
校招、秋招、春招、实习好项目！带你从零实现一个高性能的深度学习推理库，支持大模型 llama2 、Unet、Yolov5、Resnet等模型的推理。Implement a high-performance deep learning inference library st…
☆3,463Jun 22, 2025Updated last year
microsoft / AI-System
View on GitHub
System for AI Education Resource.
☆4,319Oct 25, 2024Updated last year
xlite-dev / Awesome-LLM-Inference
View on GitHub
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
☆5,404Jun 23, 2026Updated 3 weeks ago
liguodongiot / llm-action
View on GitHub
本项目旨在分享大模型相关技术原理以及实战经验（大模型工程化、大模型应用落地）
☆24,765Updated this week
zhaochenyang20 / Awesome-ML-SYS-Tutorial
View on GitHub
My learning notes for ML SYS.
☆6,759Updated this week
chenzomi12 / DeepLearningSystem
View on GitHub
AI Infra主要是指AI的基础建设，包括AI芯片、AI编译器、AI推理和训练框架等AI全栈底层技术。
☆267Mar 26, 2024Updated 2 years ago
flashinfer-ai / flashinfer
View on GitHub
FlashInfer: Kernel Library for LLM Serving
☆5,994Updated this week
gpu-mode / lectures
View on GitHub
Material for gpu-mode lectures
☆6,334Jun 15, 2026Updated last month
kvcache-ai / Mooncake
View on GitHub
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,941Updated this week
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
d2l-ai / d2l-zh
View on GitHub
《动手学深度学习》：面向中文读者、能运行、可讨论。中英文版被70多个国家的500多所大学用于教学。
☆79,028Jul 30, 2024Updated last year
mli / paper-reading
View on GitHub
深度学习经典、新论文逐段精读
☆33,598Mar 22, 2025Updated last year
sgl-project / sglang
View on GitHub
SGLang is a high-performance serving framework for large language models and multimodal models.
☆30,583Updated this week
vllm-project / vllm
View on GitHub
A high-throughput and memory-efficient inference and serving engine for LLMs
☆86,804Updated this week
GeeeekExplorer / nano-vllm
View on GitHub
Nano vLLM
☆14,582Apr 26, 2026Updated 2 months ago
NVIDIA / Megatron-LM
View on GitHub
Ongoing research training transformer models at scale
☆17,140Updated this week
Dao-AILab / flash-attention
View on GitHub
Fast and memory-efficient exact attention
☆24,502Updated this week
tile-ai / tilelang
View on GitHub
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
☆6,681Updated this week
NVIDIA / cutlass
View on GitHub
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,113Updated this week
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
Tony-Tan / CUDA_Freshman
View on GitHub
☆2,780Jan 16, 2024Updated 2 years ago
deepspeedai / DeepSpeed
View on GitHub
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,754Updated this week
triton-lang / triton
View on GitHub
Development repository for the Triton language and compiler
☆19,746Updated this week
NVIDIA / TensorRT-LLM
View on GitHub
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat…
☆14,170Updated this week
hiyouga / LlamaFactory
View on GitHub
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
☆73,422Updated this week
merrymercy / awesome-tensor-compilers
View on GitHub
A list of awesome compiler projects and papers for tensor computation and deep learning.
☆2,767Oct 19, 2024Updated last year
OpenPPL / ppq
View on GitHub
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
☆1,806Mar 28, 2024Updated 2 years ago
nndeploy / nndeploy
View on GitHub
一款简单易用和高性能的AI部署框架 | An Easy-to-Use and High-Performance AI Deployment Framework
☆1,848Apr 25, 2026Updated 2 months ago
datawhalechina / self-llm
View on GitHub
《开源大模型食用指南》针对中国宝宝量身打造的基于Linux环境快速微调（全参数/Lora）、部署国内外开源大模型（LLM）/多模态大模型（MLLM）教程
☆31,367Updated this week
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
NVIDIA / FasterTransformer
View on GitHub
Transformer related optimization, including BERT, GPT
☆6,442Mar 27, 2024Updated 2 years ago
krahets / hello-algo
View on GitHub
《Hello 算法》：动画图解、一键运行的数据结构与算法教程。支持简中、繁中、English、日本語，提供 Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, Dart 等代码实现
☆128,659Updated this week
apache / tvm
View on GitHub
Open Machine Learning Compiler Framework
☆13,597Updated this week
Liu-xiandong / How_to_optimize_in_GPU
View on GitHub
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,329Jul 29, 2023Updated 2 years ago
jingyaogong / minimind
View on GitHub
🧠「大模型」2小时完全从0训练64M的小参数LLM！Train a 64M-parameter LLM from scratch in just 2h!
☆53,680Jun 28, 2026Updated 3 weeks ago
ModelTC / LightLLM
View on GitHub
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…
☆4,183Updated this week
HuaizhengZhang / AI-Infra-from-Zero-to-Hero
View on GitHub
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Mod…
☆4,214Jul 25, 2025Updated 11 months ago