halide / Halide

a language for fast, portable data-parallel computation

☆6,139

Alternatives and similar repositories for Halide

Users who are interested in Halide are comparing it to the libraries listed below.

arrayfire / arrayfire
ArrayFire: a general purpose GPU library.
☆4,745 Updated last week
ermig1979 / Simd
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM.
☆2,185 Updated last week
ROCm / hip
HIP: C++ Heterogeneous-Compute Interface for Portability
☆4,131 Updated last week
uxlfoundation / oneDNN
oneAPI Deep Neural Network Library (oneDNN)
☆3,856 Updated this week
pytorch / glow
Compiler for Neural Network hardware accelerators
☆3,310 Updated last year
NVIDIA / thrust
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
☆4,983 Updated last year
tensorflow / mlir
"Multi-Level Intermediate Representation" Compiler Infrastructure
☆1,752 Updated 4 years ago
ARM-software / ComputeLibrary
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologi…
☆3,019 Updated 2 weeks ago
apache / tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
☆12,492 Updated this week
iree-org / iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
☆3,241 Updated last week
tensor-compiler / taco
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
☆1,314 Updated 3 months ago
NervanaSystems / ngraph
nGraph has moved to OpenVINO
☆1,347 Updated 4 years ago
moderngpu / moderngpu
Patterns and behaviors for GPU computing
☆1,734 Updated 3 years ago
ispc / ispc
Intel® Implicit SPMD Program Compiler
☆2,713 Updated this week
NVIDIA / libcudacxx
[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl
☆2,307 Updated last year
uxlfoundation / oneTBB
oneAPI Threading Building Blocks (oneTBB)
☆6,242 Updated this week
google / XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
☆2,072 Updated last week
google / gemmlowp
Low-precision matrix multiplication
☆1,812 Updated last year
Maratyszcza / NNPACK
Acceleration package for neural networks on multi-core CPUs
☆1,692 Updated last year
boostorg / compute
A C++ GPU Computing Library for OpenCL
☆1,619 Updated 2 months ago
facebookresearch / TensorComprehensions
A domain specific language to express machine learning workloads.
☆1,760 Updated 2 years ago
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,765 Updated last year
OpenMathLib / OpenBLAS
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
☆6,881 Updated last week
google / benchmark
A microbenchmark support library
☆9,633 Updated last week
tiny-dnn / tiny-dnn
header only, dependency-free deep learning framework in C++14
☆5,958 Updated 3 years ago
KomputeProject / kompute
General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …
☆2,297 Updated 3 weeks ago
xtensor-stack / xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
☆2,452 Updated this week
xtensor-stack / xtensor
C++ tensors with broadcasting and lazy computing
☆3,567 Updated 3 weeks ago
flame / how-to-optimize-gemm
☆1,902 Updated 2 years ago
simd-everywhere / simde
Implementations of SIMD instruction sets for systems which don't natively support them.
☆2,750 Updated this week