NVlabs / parrot
Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.
☆235 · Updated last week
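To make "lazy evaluation without intermediate materializations" concrete, here is a minimal sketch in plain standard C++. It is not Parrot's API and does not use CUDA or Thrust; the names `Lazy`, `lazy_view`, `lazy_map`, `lazy_sum`, and `sum_affine` are hypothetical and exist only for illustration. The idea it shows is that each chained operation returns a lightweight view, and the whole chain is evaluated in a single pass when a result is finally requested.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical lazy "array": a length plus an element generator.
// No element is computed until at(i) is called.
template <typename F>
struct Lazy {
    std::size_t n;
    F at;  // at(i) computes element i on demand
};

template <typename F>
Lazy<F> make_lazy(std::size_t n, F f) { return Lazy<F>{n, f}; }

// Wrap an existing vector without copying it (the caller keeps v alive).
inline auto lazy_view(const std::vector<float>& v) {
    const float* p = v.data();
    return make_lazy(v.size(), [p](std::size_t i) { return p[i]; });
}

// Chainable element-wise map: returns another lazy view, allocates nothing.
template <typename F, typename G>
auto lazy_map(Lazy<F> a, G g) {
    return make_lazy(a.n, [a, g](std::size_t i) { return g(a.at(i)); });
}

// Reduction: the single pass in which the fused chain is actually evaluated.
template <typename F>
float lazy_sum(Lazy<F> a) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.n; ++i) acc += a.at(i);
    return acc;
}

// Example chain: sum of (2*x + 1) over x, with no intermediate vectors.
inline float sum_affine(const std::vector<float>& x) {
    auto doubled = lazy_map(lazy_view(x), [](float v) { return 2.0f * v; });
    auto shifted = lazy_map(doubled, [](float v) { return v + 1.0f; });
    return lazy_sum(shifted);
}
```

Calling `sum_affine` on a vector walks the input once and never allocates a temporary array for `doubled` or `shifted`. A GPU library would presumably fuse the chain into a kernel rather than a host loop, but that detail is an assumption here, not something the description above confirms.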
Alternatives and similar repositories for parrot
Users who are interested in parrot are comparing it to the libraries listed below.
- The project provides high-performance concurrency, enabling highly parallel computation. ☆227 · Updated last month
- Powerful automatic differentiation in C++ and Python ☆386 · Updated last week
- FastAD is a C++ implementation of automatic differentiation both forward and reverse mode. ☆116 · Updated 2 years ago
- clad -- automatic differentiation for C/C++ ☆379 · Updated last week
- CUDA kernel author's tools ☆113 · Updated 3 years ago
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆112 · Updated 4 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆52 · Updated this week
- ☆150 · Updated last year
- A graph library using modern C++ features (e.g., C++20 ranges) to be as efficient and user-friendly as possible. ☆51 · Updated 2 weeks ago
- C++ template library for probabilistic programming ☆51 · Updated 5 years ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆25 · Updated this week
- Omnitrace: Application Profiling, Tracing, and Analysis ☆335 · Updated last week
- A fast implementation of log() and exp() ☆53 · Updated 2 years ago
- C++ library for ODE integration via Taylor's method and LLVM ☆230 · Updated last week
- Minimal Rust-inspired C++20 STL replacement ☆203 · Updated 11 months ago
- Counter-based random number generators for C, C++ and CUDA. ☆112 · Updated last year
- Abstraction Library for Parallel Kernel Acceleration ☆395 · Updated this week
- Agenium Scale vectorization library for CPUs and GPUs ☆334 · Updated 4 years ago
- A Clang-based C++ Interoperability Library ☆86 · Updated 3 weeks ago
- Bazel C++ Pybind11 Sample ☆13 · Updated this week
- ☆70 · Updated 2 months ago
- A highly optimised C++ library for mathematical applications and neural networks. ☆177 · Updated 3 months ago
- A template library for headless rendering of Signed Distance Fields based on OpenMP. ☆25 · Updated 5 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Updated 2 years ago
- High-level C++ for Accelerator Clusters ☆153 · Updated last month
- L3: Lightweight Logging Library. A very small 'C' library to generate low-footprint, non-intrusive, high-performance logging of trace me… ☆156 · Updated 6 months ago
- A nanobind example project ☆114 · Updated this week
- Experimental ranges for CUDA ☆25 · Updated 6 years ago
- C++ template metaprogram driven tensor math library ☆90 · Updated 2 months ago
- pika is a C++ tasking library built on std::execution with fibers, CUDA, HIP, and MPI support. ☆79 · Updated last week