chenzomi12 / DeepLearningSystem
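AI Infra mainly refers to the foundational infrastructure for AI, covering full-stack low-level technologies such as AI chips, AI compilers, and AI inference and training frameworks.
☆204 Updated 9 months ago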
Alternatives and similar repositories for DeepLearningSystem:
Users who are interested in DeepLearningSystem compare it to the repositories listed below.
- ☆592 Updated 5 months ago
- LLM theoretical performance analysis tools, supporting params, FLOPs, memory, and latency analysis. ☆75 Updated 2 weeks ago
- [ICCV 2023] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers ☆119 Updated last year
- FlagScale is a large model toolkit based on open-sourced projects. ☆208 Updated last week
- ☆300 Updated 6 months ago
- ☆308 Updated this week
- Triton Documentation in Chinese Simplified / Triton 中文文档 ☆52 Updated last week
- Learning how CUDA works ☆189 Updated 5 months ago
- Tutorials for writing high-performance GPU operators in AI frameworks. ☆126 Updated last year
- An LLM for question answering over financial reports ☆197 Updated 10 months ago
- A tutorial for CUDA & PyTorch ☆126 Updated 2 months ago
- [ICCV 2023] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference ☆161 Updated 4 months ago
- ☆76 Updated last year
- RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions ☆42 Updated last week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆213 Updated this week
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆88 Updated 10 months ago
- LLM notes, including model inference, transformer model structure, and LLM framework code analysis notes. ☆393 Updated last week
- LLM101n: Let's build a Storyteller (Chinese edition) ☆121 Updated 5 months ago
- Inference code for LLaMA models ☆114 Updated last year
- GLake: optimizing GPU memory management and IO transmission. ☆419 Updated last month
- Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052 ☆468 Updated 10 months ago
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver… ☆216 Updated last week
- ☆140 Updated 8 months ago
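- How to learn PyTorch and OneFlow ☆382 Updated 9 months ago
- A good project for campus recruiting (fall and spring hiring) and internships: build from scratch an LLM inference framework that supports LLama2/3 and Qwen2.5. ☆265 Updated last week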
- ☆597 Updated 7 months ago
- ☆235 Updated last month
- ☆62 Updated 2 months ago
- A lightweight LLaMA-like LLM inference framework based on Triton kernels. ☆78 Updated last week
- vLLM Documentation in Chinese Simplified / vLLM 中文文档 ☆23 Updated last week