ForceInjection / AI-fundermentals
AI fundamentals - GPU architecture, CUDA programming, and large language model basics
☆120 Updated last week
Alternatives and similar repositories for AI-fundermentals:
Users interested in AI-fundermentals are comparing it to the libraries listed below
- FlagScale is a large model toolkit based on open-source projects. ☆268 Updated this week
- GLake: optimizing GPU memory management and IO transmission. ☆456 Updated last month
- Fast OS-level support for GPU checkpoint and restore ☆183 Updated last week
- ☆129 Updated 2 months ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks (see the sketch after this list). ☆97 Updated last year
- A self-learning tutorial for CUDA high-performance programming. ☆599 Updated last week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆347 Updated this week
- Efficient and easy multi-instance LLM serving ☆383 Updated this week
- Materials for learning SGLang ☆387 Updated last month
- A hands-on tutorial that walks you through writing GPT from scratch and training a large language model ☆71 Updated 3 months ago
- How to learn PyTorch and OneFlow ☆425 Updated last year
- ☆326 Updated 3 months ago
- KV cache store for distributed LLM inference ☆151 Updated 3 weeks ago
- A model compilation solution for various hardware ☆427 Updated this week
- Homepage of the Advanced Compilation Lab ☆66 Updated 3 months ago
- Free resource for the book AI Compiler Development Guide ☆43 Updated 2 years ago
- Triton Documentation in Simplified Chinese / Triton 中文文档 ☆66 Updated last week
- LLM notes covering model inference, transformer model structure, and LLM framework code analysis. ☆732 Updated this week
- A great project for campus recruiting (fall/spring hiring and internships): build from scratch a large-model inference framework that supports LLama2/3 and Qwen2.5. ☆331 Updated 3 weeks ago
- Code & examples for "CUDA - From Correctness to Performance" ☆96 Updated 6 months ago
- ☆235 Updated 2 months ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod ☆124 Updated 3 years ago
- ☆27 Updated 3 months ago
- Summary of some awesome work on optimizing LLM inference ☆69 Updated last week
- UltraScale Playbook (Chinese edition) ☆35 Updated last month
- LLM theoretical performance analysis tools, supporting parameter-count, FLOPs, memory, and latency analysis. ☆85 Updated 3 months ago
- LLM Inference benchmark ☆408 Updated 9 months ago
- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute ☆634 Updated last year
- ☆23 Updated last month
- A light llama-like LLM inference framework based on the Triton kernel. ☆108 Updated last week
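
The roofline comparison and theoretical performance analysis tools listed above rest on the same back-of-the-envelope arithmetic: attainable throughput is bounded by either peak compute or memory bandwidth, whichever binds first at a given arithmetic intensity (FLOPs per byte moved). The sketch below is a minimal illustration of that calculation for single-token LLM decoding; it is not code from any of the listed repositories, and the A100-like peak FLOPS and bandwidth figures are assumptions chosen only for illustration.

```python
# Minimal roofline sketch (assumption: not taken from any repository listed above).
# Execution time is bounded below by max(compute time, memory time);
# the hardware peaks are illustrative A100-like numbers, not measured values.

def roofline_time(flops: float, bytes_moved: float,
                  peak_flops: float = 312e12,   # assumed FP16 peak, FLOP/s
                  peak_bw: float = 2.0e12       # assumed HBM bandwidth, bytes/s
                  ) -> float:
    """Roofline lower bound on execution time, in seconds."""
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bw
    return max(compute_time, memory_time)   # the binding resource dominates


if __name__ == "__main__":
    # Example: decoding one token of a 7B-parameter model in FP16.
    # Rule of thumb: ~2 FLOPs per parameter per generated token,
    # and every FP16 weight (2 bytes) is read once per decode step.
    params = 7e9
    flops_per_token = 2 * params          # ~1.4e10 FLOPs
    bytes_per_token = 2 * params          # ~1.4e10 bytes

    t = roofline_time(flops_per_token, bytes_per_token)
    print(f"lower bound: {t * 1e3:.1f} ms/token, ~{1 / t:.0f} tokens/s")
```

Running it reproduces the familiar result that low-batch decoding is memory-bound: roughly 7 ms per token (about 140 tokens/s) on the assumed hardware, far below the compute roof.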