intel / intel-npu-acceleration-library
Intel® NPU Acceleration Library
☆671 Updated 3 weeks ago
Alternatives and similar repositories for intel-npu-acceleration-library
Users who are interested in intel-npu-acceleration-library are comparing it to the libraries listed below.
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆274 Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆464 Updated this week
- Intel® NPU (Neural Processing Unit) Driver ☆254 Updated this week
- ☆514 Updated last week
- A Python package for extending the official PyTorch that can easily obtain performance on Intel platform ☆1,845 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆184 Updated this week
- A collection of examples for the ROCm software stack ☆208 Updated this week
- Intel® Extension for TensorFlow* ☆338 Updated last month
- Generative AI extensions for onnxruntime ☆710 Updated this week
- Low-bit LLM inference on CPU with lookup table ☆770 Updated 3 weeks ago
- A curated list of OpenVINO based AI projects ☆132 Updated 4 months ago
- DLPrimitives/OpenCL out of tree backend for pytorch ☆346 Updated 8 months ago
- An innovative library for efficient LLM inference via low-bit quantization ☆350 Updated 8 months ago
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆395 Updated this week
- ☆108 Updated 3 weeks ago
- OpenVINO Intel NPU Compiler ☆50 Updated this week
- Local LLM Server with NPU Acceleration ☆180 Updated last week
- ☆251 Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆61 Updated 2 months ago
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… ☆2,402 Updated this week
- Tenstorrent TT-BUDA Repository ☆312 Updated last month
- AI Tensor Engine for ROCm ☆195 Updated this week
- Tools for easier OpenVINO development/debugging ☆9 Updated last month
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆349 Updated this week
- Deep Learning Primitives and Mini-Framework for OpenCL ☆196 Updated 8 months ago
- AMD's graph optimization engine. ☆217 Updated this week
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆384 Updated this week
- Collection of benchmarks to measure basic GPU capabilities ☆370 Updated 3 months ago
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆612 Updated last week
- llama.cpp fork with additional SOTA quants and improved performance ☆439 Updated this week