intel / intel-npu-acceleration-library
Intel® NPU Acceleration Library
☆507 Updated this week
Related projects
Alternatives and complementary repositories for intel-npu-acceleration-library
- ☆399 Updated this week
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆152 Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆409 Updated this week
- ☆231 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆143 Updated this week
- Intel® Extension for TensorFlow* ☆318 Updated last month
- OpenVINO NPU Plugin ☆37 Updated 2 weeks ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆286 Updated this week
- Intel® NPU (Neural Processing Unit) Driver ☆181 Updated this week
- Generative AI extensions for onnxruntime ☆514 Updated this week
- An innovative library for efficient LLM inference via low-bit quantization ☆348 Updated 2 months ago
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆313 Updated this week
- cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples on how to use it ☆455 Updated 3 weeks ago
- A Python package for extending the official PyTorch that can easily obtain performance on Intel platform ☆1,624 Updated this week
- Low-bit LLM inference on CPU with lookup table ☆583 Updated this week
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆420 Updated this week
- Supporting PyTorch models with the Google AI Edge TFLite runtime. ☆371 Updated this week
- TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillati… ☆567 Updated this week
- OpenVINO Tokenizers extension ☆25 Updated this week
- Common utilities for ONNX converters ☆251 Updated 5 months ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆57 Updated 2 months ago
- An Awesome list of oneAPI projects ☆126 Updated 3 months ago
- Tenstorrent TT-BUDA Repository ☆225 Updated last month
- Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure ☆767 Updated this week
- FlashInfer: Kernel Library for LLM Serving ☆1,452 Updated this week
- A collection of examples for the ROCm software stack ☆167 Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆523 Updated this week
- DLPrimitives/OpenCL out of tree backend for pytorch ☆287 Updated 2 months ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆270 Updated this week
- ☆1,022 Updated 10 months ago