intel / intel-extension-for-openxla

☆51

Alternatives and similar repositories for intel-extension-for-openxla

Users who are interested in intel-extension-for-openxla are comparing it to the libraries listed below.

intel / torch-xpu-ops
☆61 Updated this week
ROCm / aotriton
Ahead of Time (AOT) Triton Math Library
☆83 Updated this week
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆362 Updated this week
intel / intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
☆218 Updated this week
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆63 Updated 4 months ago
intel / torch-ccl
oneCCL Bindings for Pytorch* (deprecated)
☆102 Updated last week
ROCm / iris
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
☆111 Updated this week
openxla / community
Stores documents and resources used by the OpenXLA developer community
☆131 Updated last year
openxla / shardy
MLIR-based partitioning system
☆148 Updated this week
iree-org / iree-nvgpu
☆50 Updated last year
jax-ml / ml_dtypes
A stand-alone implementation of several NumPy dtype extensions used in machine learning.
☆306 Updated 2 weeks ago
ROCm / triton
Development repository for the Triton language and compiler
☆137 Updated this week
meta-pytorch / triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)
☆47 Updated 2 months ago
ROCm / hipBLAS
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆147 Updated this week
ROCm / hipBLASLt
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆114 Updated this week
meta-pytorch / tlparse
TORCH_LOGS parser for PT2
☆64 Updated last week
intel / sycl-tla
SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs
☆50 Updated this week
ROCm / tritonBLAS
A lightweight triton-based General Matrix Multiplication (GEMM) library.
☆27 Updated this week
intel / xetla
☆62 Updated 11 months ago
salykova / sgemm.cu
High-Performance SGEMM on CUDA devices
☆110 Updated 9 months ago
ROCm / TransformerEngine
☆51 Updated this week
cchan / tccl
extensible collectives library in triton
☆91 Updated 7 months ago
ROCm / Tensile
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆254 Updated last week
NVIDIA / jaxpp
JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training
☆57 Updated last month
microsoft / TileFusion
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆100 Updated 4 months ago
NVIDIA / numba-cuda
The CUDA target for Numba
☆210 Updated this week
north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆41 Updated last year
iree-org / iree-jax
☆53 Updated last year
ROCm / rocMLIR
☆157 Updated this week
ROCm / pyrsmi
python package of rocm-smi-lib
☆24 Updated 4 months ago