nnperfwins / nnPerfLinks
☆13 · Updated last year
Alternatives and similar repositories for nnPerf
Users interested in nnPerf are comparing it to the libraries listed below
- ☆120 · Updated this week
- ☆78 · Updated 2 years ago
- Hands-on model tuning with TVM, profiled on a Mac M1, an x86 CPU, and a GTX-1080 GPU. ☆49 · Updated 2 years ago
- Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of pap… ☆283 · Updated 11 months ago
- LLM theoretical performance analysis tools supporting parameter-count, FLOPs, memory, and latency analysis. ☆115 · Updated 7 months ago
- Deploying Transformer models for computer vision to mobile devices. ☆18 · Updated 4 years ago
- A summary of awesome work on optimizing LLM inference. ☆173 · Updated 2 months ago
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom 2022]. ☆19 · Updated 3 years ago
- Papers and their code for AI systems. ☆347 · Updated 2 months ago
- A list of awesome edge-AI inference-related papers. ☆99 · Updated 2 years ago
- ☆102 · Updated 2 years ago
- A simplified flash-attention implementation using CUTLASS, intended as a teaching example. ☆56 · Updated last year
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver… ☆298 · Updated this week
- ☆222 · Updated last year
- A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices. ☆363 · Updated last year
- Analyzes the inference of Large Language Models (LLMs), covering aspects such as computation, storage, transmission, and hardware roofline mod… (a minimal sketch of this style of analysis follows this list). ☆617 · Updated last year
- Tutorials for writing high-performance GPU operators in AI frameworks. ☆136 · Updated 2 years ago
- A curated list of awesome projects and papers for distributed training or inference ☆265 · Updated last year
- A repository for storing personal notes and annotated papers from daily research. ☆180 · Updated 3 weeks ago
- Curated collection of papers in machine learning systems ☆507 · Updated this week
- LLM serving cluster simulator ☆135 · Updated last year
- Multi-Instance-GPU profiling tool ☆58 · Updated 2 years ago
- ☆212 · Updated 2 years ago
- Repo for "SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting" (ISCA '25) ☆70 · Updated 9 months ago
- A baseline repository of Auto-Parallelism in Training Neural Networks ☆147 · Updated 3 years ago
- LLM Inference analyzer for different hardware platforms ☆100 · Updated 2 months ago
- ☆628 · Updated 3 weeks ago
- mperf is an operator performance tuning toolbox for mobile/embedded platforms ☆192 · Updated 2 years ago
- ☆19 · Updated 3 years ago
- Model-less Inference Serving ☆94 · Updated 2 years ago
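
Several of the entries above (the LLM theoretical performance tools and the LLM inference analyzer) produce roofline-style estimates of inference cost from model and hardware parameters. The snippet below is a minimal, illustrative sketch of that kind of back-of-envelope analysis, not code from any listed project; the function name `decode_step_latency` and the 7B / FP16 / 2 TB/s / 300 TFLOP/s figures are assumed example values.

```python
# Hypothetical roofline-style estimate of per-token LLM decode latency.
# All model and hardware numbers below are illustrative assumptions.

def decode_step_latency(n_params: float, bytes_per_param: float,
                        mem_bw_gbs: float, peak_tflops: float) -> float:
    """Estimate per-token decode latency (seconds) for a dense decoder-only LLM.

    Decoding one token reads every weight once (memory traffic) and performs
    roughly 2 FLOPs per parameter (one multiply-accumulate), so the latency is
    the max of the memory time and the compute time under a roofline model.
    """
    mem_time = n_params * bytes_per_param / (mem_bw_gbs * 1e9)   # weight traffic
    compute_time = 2 * n_params / (peak_tflops * 1e12)           # MAC work
    return max(mem_time, compute_time)

if __name__ == "__main__":
    # Assumed: 7B-parameter model in FP16 on a GPU with ~2 TB/s HBM and ~300 TFLOP/s FP16.
    latency = decode_step_latency(n_params=7e9, bytes_per_param=2,
                                  mem_bw_gbs=2000, peak_tflops=300)
    print(f"~{latency * 1e3:.2f} ms per decoded token")
```

Under these example assumptions the memory term dominates, which matches the common observation that single-batch decoding is memory-bandwidth bound rather than compute bound.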