cr7258 / ai-infra-learning
This repository organizes materials, recordings, and schedules related to AI-infra learning meetings.
☆325Feb 8, 2026Updated last week
Alternatives and similar repositories for ai-infra-learning
Users who are interested in ai-infra-learning are comparing it to the repositories listed below.
- ViTALiTy (HPCA'23) Code Repository☆23Mar 13, 2023Updated 2 years ago
- An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences☆31Mar 7, 2024Updated last year
- An LLM Mock Server that supports simulating the protocols of all LLM providers.☆11Oct 18, 2025Updated 3 months ago
- A Model Context Protocol (MCP) server implementation that enables comprehensive configuration and management of Higress.☆22Mar 29, 2025Updated 10 months ago
- SGLang is a fast serving framework for large language models and vision language models.☆18Updated this week
- LLM training parallelisms (DP, FSDP, TP, PP) in pure C☆26Jan 27, 2026Updated 2 weeks ago
- 📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉☆9,666Updated this week
- ☆49Apr 15, 2024Updated last year
- Converted the training data of OpenVLA into general form of multimodal training instructions and then used with LLaVA-OneVision☆23Jan 12, 2025Updated last year
- A light weight vLLM simulator, for mocking out replicas.☆85Updated this week
- This project is designed to simulate GPU information, making it easier to test scenarios where a GPU is not available.☆64Jan 9, 2026Updated last month
- 🎉 An awesome & curated list of best LLMOps tools.☆195Feb 4, 2026Updated last week
- Puzzles for learning Triton, play it with minimal environment configuration!☆624Dec 28, 2025Updated last month
- How to optimize some algorithms in CUDA.☆2,819Updated this week
- ☆11Jun 3, 2025Updated 8 months ago
- tutorial for writing custom pytorch cpp+cuda kernel, applied on volume rendering (NeRF)☆29Dec 12, 2023Updated 2 years ago
- A self-learning tutorial for CUDA High Performance Programming.☆890Jan 14, 2026Updated last month
- ☆67Nov 21, 2024Updated last year
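- AIInfra (AI infrastructure) covers the AI system from underlying hardware such as chips up to the software stack that supports large-model training and inference.☆6,039Dec 22, 2025Updated last month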
- A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.☆3,443Updated this week
- A High-Performance LLM Inference Engine with vLLM-Style Continuous Batching☆92Jan 2, 2026Updated last month
- Hands-On Practical MLIR Tutorial☆719Oct 20, 2023Updated 2 years ago
- Nano vLLM☆11,617Nov 3, 2025Updated 3 months ago
- learning how CUDA works☆375Mar 3, 2025Updated 11 months ago
- This project is about convolution operator optimization on GPU, include GEMM based (Implicit GEMM) convolution.☆43Sep 29, 2025Updated 4 months ago
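- A good project for campus recruiting (fall and spring hiring) and internships: build, from scratch, a large-model inference framework that supports LLama2/3 and Qwen2.5.☆494Oct 28, 2025Updated 3 months ago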
- Learn how to create impactful AI Agents using Agno AI Python Package☆13Jul 31, 2025Updated 6 months ago
- Automatic tree delineation from LiDAR point clouds☆12Jun 27, 2018Updated 7 years ago
- Extending BookSim2.0 and HotSpot6.0 for Power, Performance and Thermal evaluation of 3D NoC Architectures☆12Aug 9, 2019Updated 6 years ago
- ☆32Dec 10, 2025Updated 2 months ago
- DoctorRAG is a medical AI that mimics doctor-like reasoning by combining textbook knowledge with insights from similar patient cases, usi…☆15May 21, 2025Updated 8 months ago
- MutatingWebhookConfiguration based on k8s. Implementing node overselling☆12Aug 18, 2023Updated 2 years ago
- Implement some method of LLM KV Cache Sparsity☆41Jun 6, 2024Updated last year
- Stateful LLM Serving☆95Mar 11, 2025Updated 11 months ago
- A series of RISC-V soft core processor written from scratch. Now, we're using all open-source toolchain (chisel, mill, verilator, NEMU, …☆46Nov 8, 2023Updated 2 years ago
- ☆10Jan 8, 2025Updated last year
- Coarse Grained Reconfigurable Arrays with Chisel3☆13Jul 1, 2024Updated last year
- Codes for our paper "Exploring Bit-Slice Sparsity in Deep Neural Networks for Efficient ReRAM-Based Deployment" [NeurIPS'19 EMC2 workshop]…☆10Oct 12, 2020Updated 5 years ago
- ☆11Jun 11, 2021Updated 4 years ago