halide / Halide
a language for fast, portable data-parallel computation
☆5,906 Updated this week
Related projects
Alternatives and complementary repositories for Halide
- ArrayFire: a general purpose GPU library. ☆4,567 Updated 2 weeks ago
- Compiler for Neural Network hardware accelerators ☆3,236 Updated 6 months ago
- [ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl ☆4,924 Updated 9 months ago
- Intel® Implicit SPMD Program Compiler ☆2,520 Updated this week
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆11,798 Updated this week
- [ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl ☆2,294 Updated 9 months ago
- A retargetable MLIR-based machine learning compiler and runtime toolkit. ☆2,846 Updated this week
- oneAPI Deep Neural Network Library (oneDNN) ☆3,635 Updated this week
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆1,885 Updated this week
- "Multi-Level Intermediate Representation" Compiler Infrastructure ☆1,737 Updated 3 years ago
- Conan - The open-source C and C++ package manager ☆8,285 Updated this week
- The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologi… ☆2,861 Updated this week
- header only, dependency-free deep learning framework in C++14 ☆5,859 Updated 2 years ago
- C/C++ Performance Profiler ☆4,224 Updated last week
- C++ Library Manager for Windows, Linux, and MacOS ☆23,288 Updated this week
- A domain specific language to express machine learning workloads. ☆1,761 Updated last year
- C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM. ☆2,070 Updated this week
- C++ tensors with broadcasting and lazy computing ☆3,363 Updated 3 months ago
- OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version. ☆6,401 Updated this week
- HIP: C++ Heterogeneous-Compute Interface for Portability ☆3,763 Updated this week
- oneAPI Threading Building Blocks (oneTBB) ☆5,732 Updated this week
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,256 Updated 7 months ago
- mlpack: a fast, header-only C++ machine learning library ☆5,118 Updated this week
- Open standard for machine learning interoperability ☆17,949 Updated this week
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE) ☆2,211 Updated last week
- Low-precision matrix multiplication ☆1,780 Updated 9 months ago
- A C++ standalone library for machine learning ☆5,288 Updated this week
- mimalloc is a compact general purpose allocator with excellent performance. ☆10,595 Updated this week
- Patterns and behaviors for GPU computing ☆1,667 Updated 2 years ago
- C++ implementation of the Python Numpy library ☆3,581 Updated last month