amd / ZenDNN-pytorch

☆9

Alternatives and similar repositories for ZenDNN-pytorch

Users who are interested in ZenDNN-pytorch are comparing it to the libraries listed below.

amd / ZenDNN
☆106 Updated last month
ROCm / rocAL
The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a…
☆17 Updated this week
amd / ZenDNN-onnxruntime
☆11 Updated 6 months ago
ROCm / half
☆19 Updated 2 weeks ago
amd / ryzen-ai-documentation
Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take…
☆64 Updated this week
intel / onnxruntime
ONNX Runtime: cross-platform, high performance scoring engine for ML models
☆65 Updated this week
ROCm / HIPCC
HIPCC: HIP compiler driver
☆40 Updated last year
openvinotoolkit / npu_compiler
OpenVINO Intel NPU Compiler
☆56 Updated this week
ROCm / rocm-install-on-linux
☆20 Updated this week
ROCm / roctracer
ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs
☆83 Updated last week
ROCm / AMDMIGraphX
AMD's graph optimization engine.
☆220 Updated this week
intel / intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
☆189 Updated this week
HabanaAI / SynapseAI_Core
SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi
☆41 Updated 4 months ago
NVIDIA / GPUStressTest
GPU Stress Test is a tool to stress the compute engine of NVIDIA Tesla GPUs by running a BLAS matrix multiply using different data types…
☆94 Updated last month
Arm-China / Compass_Optimizer
Compass Optimizer (OPT for short), is part of the Zhouyi Compass Neural Network Compiler. The OPT is designed for converting the float In…
☆29 Updated last week
ROCm / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆79 Updated this week
ROCm / aiter
AI Tensor Engine for ROCm
☆201 Updated this week
ROCm / rocm_bandwidth_test
Bandwidth test for ROCm
☆56 Updated 2 weeks ago
ROCm / rocm-examples
A collection of examples for the ROCm software stack
☆216 Updated this week
MyCaffe / NCCL
Windows version of NVIDIA's NCCL ('Nickel') for multi-GPU training - please use https://github.com/NVIDIA/nccl for changes.
☆59 Updated last year
intel / torch-xpu-ops
☆46 Updated this week
ROCm / hipSPARSELt
☆11 Updated this week
ROCm / rocDecode
rocDecode is a high performance video decode SDK for AMD hardware
☆26 Updated this week
Xilinx / inference-server
☆45 Updated 11 months ago
amd / UIF
☆60 Updated last year
ROCm / Tensile
Stretching GPU performance for GEMMs and tensor contractions.
☆242 Updated last week
intel / npu-nn-cost-model
Library for modelling performance costs of different Neural Network workloads on NPU devices
☆34 Updated last month
ROCm / clr
☆136 Updated this week
GaoYusong / llm.cpp
A C++ port of karpathy/llm.c features a tiny torch library while maintaining overall simplicity.
☆33 Updated 10 months ago
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆61 Updated 3 months ago