ForceInjection / AI-fundermentals
AI fundamentals: GPU architecture, CUDA programming, and large-model basics
☆83Updated this week
Alternatives and similar repositories for AI-fundermentals:
Users that are interested in AI-fundermentals are comparing it to the libraries listed below
- GLake: optimizing GPU memory management and IO transmission.☆435Updated 3 months ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆92Updated 11 months ago
- How to learn PyTorch and OneFlow☆402Updated 11 months ago
- ☆319Updated last month
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆122Updated 3 years ago
- Efficient and easy multi-instance LLM serving☆319Updated this week
- Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of pap…☆230Updated this week
- A model compilation solution for various hardware☆409Updated this week
- A self-learning tutorial for CUDA high-performance programming.☆418Updated this week
- Triton documentation in Simplified Chinese☆57Updated last month
- FlagScale is a large model toolkit based on open-sourced projects.☆244Updated this week
- HAMi-core compiles libvgpu.so, which enforces hard limits on GPU resources in containers☆138Updated last week
- ☆226Updated 3 weeks ago
- Personal homepage of the Advanced Compiler Lab☆43Updated last month
- Paper reading: covers distributed systems, virtualization, networking, and machine learning☆23Updated 4 years ago
- GPUd automates monitoring, diagnostics, and issue identification for GPUs☆286Updated this week
- Summary of some awesome work for optimizing LLM inference☆58Updated last month
- Device plugin for Volcano vGPU that supports hard resource isolation☆64Updated this week
- Tutorials for writing high-performance GPU operators in AI frameworks.☆129Updated last year
- The IX device plugin is a DaemonSet for Kubernetes, which can help to expose the Iluvatar GPU in the Kubernetes cluster.☆12Updated 2 months ago
- Hooks CUDA-related dynamic libraries using automated code generation tools.☆149Updated last year
- A collection of noteworthy MLSys bloggers (algorithms/systems)☆198Updated 2 months ago
- Fast OS-level support for GPU checkpoint and restore☆165Updated this week
- Disaggregated serving system for Large Language Models (LLMs).☆483Updated 6 months ago
- Automatic tuning for ML model deployment on Kubernetes☆81Updated 4 months ago
- PyTorch distributed training acceleration framework☆44Updated 3 weeks ago
- Kubernetes Operator for AI and Big Data Elastic Training☆85Updated last month
- A guide to hand-writing CUDA operators and preparing for interviews☆185Updated last month
- Materials for learning SGLang☆314Updated last week
- A tutorial for CUDA & PyTorch☆127Updated last month