eyalroz / gpu-kernel-runner
Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command line
☆18 · Updated this week
Related projects
Alternatives and complementary repositories for gpu-kernel-runner
- Experimental ranges for CUDA ☆25 · Updated 5 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆43 · Updated 10 months ago
- Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast! ☆73 · Updated 6 months ago
- Library for length-agnostic SIMD intrinsic support and the corresponding math operations ☆20 · Updated 3 years ago
- Reference implementation of the draft C++ GraphBLAS specification. ☆28 · Updated 9 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Updated last year
- Compiler-agnostic metaprogramming library providing concepts, type operations and tuples for C++ and CUDA ☆80 · Updated this week
- WIP · CUDA compatibility for Blaze · https://bitbucket.org/blaze-lib/blaze ☆17 · Updated 5 years ago
- A simple, but fast, triangular solver ☆17 · Updated 3 years ago
- pika is a C++ tasking library built on std::execution with fibers, CUDA, HIP, and MPI support. ☆64 · Updated this week
- Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as ta… ☆44 · Updated 3 years ago
- CUDA executors ☆14 · Updated 3 years ago
- CUDA kernel author's tools ☆109 · Updated 2 years ago
- Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task q… ☆43 · Updated last week
- A C++ allocator based on cudaMallocManaged ☆23 · Updated 6 years ago
- Boost.org graph_parallel module ☆27 · Updated this week
- generic C++ containers; matrix, triangle matrix, crs sparse matrix, etc. ☆12 · Updated 6 years ago
- An implementation of HIP that works on CPUs, across OSes. ☆112 · Updated 8 months ago
- 3D Tensors for Blaze (https://bitbucket.org/blaze-lib/blaze) ☆36 · Updated 4 years ago
- ☆68 · Updated 4 years ago
- Department of Energy Standard Utility Library ☆30 · Updated 2 months ago
- Codeplay project for contributions to the LLVM SYCL implementation ☆30 · Updated 3 years ago
- ☆28 · Updated 2 weeks ago
- mallocMC: Memory Allocator for Many Core Architectures ☆51 · Updated last week
- An alternative to Boost.MPI for a user friendly C++ interface for MPI (MPICH). ☆19 · Updated 6 years ago
- C++20 and onward collection of high performance data containers and related tools ☆51 · Updated last month
- A unified framework across multiple programming platforms ☆33 · Updated 5 months ago
- SYCL Conformance Tests ☆62 · Updated this week
- DLA-Future ☆65 · Updated this week
- Distributed View Extension for Kokkos ☆43 · Updated 2 months ago