intel / intel-npu-acceleration-library
Intel® NPU Acceleration Library
☆578 · Updated this week
Alternatives and similar repositories for intel-npu-acceleration-library:
Users interested in intel-npu-acceleration-library are also comparing it to the libraries listed below.
- ☆447 · Updated last month
- Run generative AI models with a simple C++/Python API using the OpenVINO Runtime ☆198 · Updated this week
- A Python package that extends official PyTorch to deliver additional performance on Intel platforms ☆1,694 · Updated this week
- Intel® Extension for TensorFlow* ☆329 · Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆430 · Updated this week
- Intel® NPU (Neural Processing Unit) Driver ☆213 · Updated 3 weeks ago
- Generative AI extensions for onnxruntime ☆581 · Updated this week
- OpenVINO NPU Plugin ☆41 · Updated last month
- Low-bit LLM inference on CPU with lookup table ☆646 · Updated last week
- HIPIFY: Convert CUDA to Portable C++ Code ☆537 · Updated this week
- OpenAI Triton backend for Intel® GPUs ☆154 · Updated this week
- TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillati… ☆667 · Updated last week
- Tenstorrent TT-BUDA Repository ☆274 · Updated 3 weeks ago
- TT-NN operator library and TT-Metalium low-level kernel programming model. ☆587 · Updated this week
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆496 · Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆334 · Updated this week
- ☆99 · Updated 2 months ago
- ☆241 · Updated this week
- AMD's graph optimization engine. ☆196 · Updated this week
- The no-code AI toolchain ☆80 · Updated this week
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆304 · Updated this week
- FlashInfer: Kernel Library for LLM Serving ☆1,797 · Updated this week
- An innovative library for efficient LLM inference via low-bit quantization ☆352 · Updated 4 months ago
- cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it ☆484 · Updated 3 weeks ago
- CUDA Python: Performance meets Productivity ☆1,045 · Updated this week
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆330 · Updated 3 weeks ago
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆72 · Updated this week
- Shared Middle-Layer for Triton Compilation ☆220 · Updated this week
- Neural Network Compression Framework for enhanced OpenVINO™ inference ☆960 · Updated this week
- CUDA Kernel Benchmarking Library ☆547 · Updated last month