ForceInjection / AI-fundermentals
AI fundamentals: GPU architecture, CUDA programming, and large language model basics
☆58Updated last month
Related projects
Alternatives and complementary repositories for AI-fundermentals
- HAMi-core compiles libvgpu.so, which enforces hard limits on GPU usage in containers☆105Updated last month
- ☆214Updated this week
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆120Updated 2 years ago
- GLake: optimizing GPU memory management and IO transmission.☆379Updated 3 months ago
- ☆55Updated 4 years ago
- Kubernetes Operator for AI and Big Data Elastic Training☆84Updated 3 months ago
- Device plugin for Volcano vGPU that supports hard resource isolation☆48Updated 2 weeks ago
- Hooks CUDA-related dynamic libraries using automated code generation tools.☆139Updated 11 months ago
- Efficient and easy multi-instance LLM serving☆216Updated this week
- Automatic tuning for ML model deployment on Kubernetes☆80Updated 3 weeks ago
- ☆48Updated last month
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆75Updated 8 months ago
- Using CRDs to manage GPU resources in Kubernetes.☆191Updated 2 years ago
- ☆129Updated 3 years ago
- TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models.☆90Updated last year
- Yoda is a Kubernetes scheduler based on GPU metrics.☆140Updated 2 years ago
- NVIDIA NCCL Tests for Distributed Training☆70Updated 2 weeks ago
- ☆504Updated 5 months ago
- Device plugins for Volcano, e.g. GPU☆105Updated 2 months ago
- High performance Transformer implementation in C++.☆82Updated 2 months ago
- ☆198Updated 3 weeks ago
- A low-latency & high-throughput serving engine for LLMs☆245Updated 2 months ago
- Reading notes☆23Updated 5 years ago
- ☆269Updated last year
- ☆31Updated 3 years ago
- Summary of some awesome work for optimizing LLM inference☆37Updated 2 weeks ago
- An interference-aware scheduler for fine-grained GPU sharing☆111Updated 6 months ago
- GPU-scheduler-for-deep-learning☆200Updated 4 years ago
- Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving"☆57Updated 5 months ago
- Paper reading: covers distributed systems, virtualization, networking, and machine learning☆22Updated 4 years ago