NVlabs / parrot
Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.
☆240 Updated this week
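The lazy, fused evaluation described above can be illustrated with a minimal CPU-only sketch. This is not Parrot's API: `LazyMap`, `lazy`, `map`, `materialize`, and `fused_example` are hypothetical names invented for illustration, and plain `std::vector` stands in for GPU arrays. The sketch only shows the general idea, assuming that chained element-wise operations are composed into one function and applied in a single pass, so no intermediate array is allocated between steps.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical lazy view (not Parrot's API): holds a pointer to the input
// and a function, and computes nothing until materialize() is called.
template <typename T, typename F>
class LazyMap {
public:
    LazyMap(const std::vector<T>& src, F f) : src_(&src), f_(std::move(f)) {}

    // Chain another element-wise operation. This only composes functions;
    // no loop runs and no intermediate array is created.
    template <typename G>
    auto map(G g) const {
        auto composed = [f = f_, g = std::move(g)](const T& x) { return g(f(x)); };
        return LazyMap<T, decltype(composed)>(*src_, std::move(composed));
    }

    // Single pass over the input; the only place an output array is allocated.
    auto materialize() const {
        using R = std::decay_t<std::invoke_result_t<const F&, const T&>>;
        std::vector<R> out;
        out.reserve(src_->size());
        for (const T& x : *src_) out.push_back(f_(x));
        return out;
    }

private:
    const std::vector<T>* src_;  // non-owning; the input must outlive the view
    F f_;
};

// Entry point: wrap a vector in a lazy view that starts as the identity map.
template <typename T>
auto lazy(const std::vector<T>& src) {
    auto identity = [](const T& x) { return x; };
    return LazyMap<T, decltype(identity)>(src, identity);
}

// Usage: two chained operations, evaluated as one fused loop computing x * 2 + 1.
std::vector<int> fused_example() {
    const std::vector<int> data{1, 2, 3, 4};
    return lazy(data)
        .map([](int x) { return x * 2; })
        .map([](int x) { return x + 1; })
        .materialize();
}
```

An eager version would allocate one temporary vector for `x * 2` and a second for the final result; here the composed function is applied once per element. A GPU library could apply the same idea by fusing the chain into a single kernel launch, though how Parrot does this internally is not stated here.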
Alternatives and similar repositories for parrot
Users who are interested in parrot are comparing it to the libraries listed below.
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆53 Updated this week
- CUDA kernel author's tools ☆115 Updated 3 years ago
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆113 Updated 5 months ago
- The project provides high-performance concurrency, enabling highly parallel computation. ☆230 Updated last month
- ☆149 Updated last year
- clad -- automatic differentiation for C/C++ ☆378 Updated 2 weeks ago
- Agenium Scale vectorization library for CPUs and GPUs ☆336 Updated 4 years ago
- Powerful automatic differentiation in C++ and Python ☆389 Updated this week
- A fast implementation of log() and exp() ☆55 Updated 3 years ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆24 Updated 3 weeks ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- A nanobind example project ☆115 Updated this week
- pika is a C++ tasking library built on std::execution with fibers, CUDA, HIP, and MPI support. ☆79 Updated last week
- C++20 Tensor library ☆33 Updated this week
- C++ template library for probabilistic programming ☆51 Updated 5 years ago
- Abstraction Library for Parallel Kernel Acceleration ☆397 Updated last week
- FastAD is a C++ implementation of automatic differentiation both forward and reverse mode. ☆118 Updated 2 years ago
- High-level C++ for Accelerator Clusters ☆153 Updated 3 weeks ago
- An implementation of HIP that works on CPUs, across OSes. ☆131 Updated last year
- A Low-Level Abstraction of Memory Access ☆92 Updated last year
- Minimal Rust-inspired C++20 STL replacement ☆208 Updated 11 months ago
- C++ template metaprogram driven tensor math library ☆90 Updated 2 months ago
- Exploring using stdpar and Cython ☆34 Updated 5 years ago
- A graph library using modern C++ features (e.g., C++20 ranges) to be as efficient and user-friendly as possible. ☆52 Updated last month
- Atomistic Spin Simulation Framework ☆66 Updated 5 years ago
- A highly optimised C++ library for mathematical applications and neural networks. ☆177 Updated 4 months ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆260 Updated 11 months ago
- Counter-based random number generators for C, C++ and CUDA. ☆112 Updated last year
- Omnitrace: Application Profiling, Tracing, and Analysis ☆335 Updated last week
- Reference implementation of the draft C++ GraphBLAS specification. ☆32 Updated 10 months ago