intel / intel-npu-acceleration-library
Intel® NPU Acceleration Library
☆658 · Updated 2 months ago
Alternatives and similar repositories for intel-npu-acceleration-library:
Users interested in intel-npu-acceleration-library are comparing it to the libraries listed below.
- Run generative AI models with a simple C++/Python API using the OpenVINO Runtime ☆249 · Updated this week
- ☆495 · Updated this week
- Intel® NPU (Neural Processing Unit) Driver ☆237 · Updated this week
- OpenVINO NPU Plugin ☆48 · Updated this week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆456 · Updated this week
- A Python package extending the official PyTorch for improved performance on Intel platforms ☆1,812 · Updated this week
- Generative AI extensions for onnxruntime ☆667 · Updated this week
- Low-bit LLM inference on CPU with lookup tables ☆705 · Updated 2 months ago
- An innovative library for efficient LLM inference via low-bit quantization ☆351 · Updated 7 months ago
- BitBLAS is a library supporting mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆572 · Updated last month
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆329 · Updated this week
- OpenAI Triton backend for Intel® GPUs ☆172 · Updated this week
- Advanced quantization algorithms for LLMs/VLMs. ☆413 · Updated last week
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,166 · Updated 5 months ago
- ☆104 · Updated 3 weeks ago
- cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it ☆534 · Updated 2 weeks ago
- ☆410 · Updated last week
- Intel® Extension for TensorFlow* ☆336 · Updated 2 weeks ago
- FlashInfer: Kernel Library for LLM Serving ☆2,532 · Updated this week
- LLM SDK for OnnxRuntime GenAI (OGA) ☆119 · Updated this week
- OpenVINO Tokenizers extension ☆31 · Updated this week
- A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, et… ☆832 · Updated 2 weeks ago
- AI Tensor Engine for ROCm ☆142 · Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆567 · Updated last week
- ☆157 · Updated this week
- ☆106 · Updated 3 weeks ago
- Machine learning compiler based on MLIR for the Sophgo TPU. ☆700 · Updated this week
- Library for modelling performance costs of different neural-network workloads on NPU devices ☆32 · Updated last week
- LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-… ☆330 · Updated this week
- ☆250 · Updated this week