openvinotoolkit / npu_compiler
OpenVINO NPU Plugin
☆47Updated this week
Alternatives and similar repositories for npu_compiler:
Users who are interested in npu_compiler are comparing it to the libraries listed below:
- OpenAI Triton backend for Intel® GPUs☆172Updated this week
- ☆61Updated 3 months ago
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona…☆84Updated this week
- Library for modelling performance costs of different Neural Network workloads on NPU devices☆32Updated last week
- ☆60Updated last year
- ☆83Updated this week
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)☆40Updated 2 weeks ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.☆132Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…☆62Updated 3 weeks ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi☆38Updated 2 months ago
- oneAPI Collective Communications Library (oneCCL)☆227Updated this week
- ☆138Updated this week
- LLVM OpenCL C compiler suite for ventus GPGPU☆43Updated 2 weeks ago
- ☆37Updated this week
- Bandwidth test for ROCm☆54Updated 3 weeks ago
- Intel® Tensor Processing Primitives extension for Pytorch*☆12Updated 2 weeks ago
- Development repository for the Triton language and compiler☆114Updated this week
- RCCL Performance Benchmark Tests☆60Updated 3 weeks ago
- AMD's graph optimization engine.☆213Updated this week
- oneCCL Bindings for Pytorch*☆91Updated this week
- CUDA PTX-ISA Document, Chinese translation☆37Updated 2 weeks ago
- ROC profiler library. Profiling with perf-counters and derived metrics.☆138Updated this week
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…☆225Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators☆373Updated this week
- ☆122Updated this week
- Intel® NPU (Neural Processing Unit) Driver☆237Updated this week
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices.☆118Updated 3 months ago
- GPUDirect example☆59Updated 3 years ago
- oneAPI Level Zero Specification Headers and Loader☆241Updated this week
- ☆92Updated 11 months ago