lattice-land / cuda-battery
Abstractions of memory, allocator, vector, tuple, shared_ptr, unique_ptr, bitset, variant and string working on both CPU and GPU
☆30Updated 2 weeks ago
Alternatives and similar repositories for cuda-battery:
Users who are interested in cuda-battery are comparing it to the libraries listed below.
- Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task q…☆45Updated 5 months ago
- vectorization of the kd-tree data structure and search algorithm☆40Updated 7 years ago
- High-Performance Computing: CPU Instructions, GPU OpenCL & CUDA, etc.☆14Updated 11 months ago
- An expression template based linear algebra library running completely on the GPU using CUDA☆25Updated 3 years ago
- SuiteSparse: a suite of sparse matrix packages by @DrTimothyAldenDavis et al. with native CMake support☆53Updated 9 months ago
- Light and self-contained implementation of C++17 parallel algorithms.☆34Updated 5 months ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.☆50Updated last month
- CUDA implementation of parallel Depth First Search (DFS) algorithm and its comparison with a serial C++ DFS implementation.☆29Updated 6 years ago
- GPU B-Tree with support for versioning (snapshots).☆47Updated 6 months ago
- SyCL 2020 course by 小彭老师 (under construction; to be released later in live streams)☆15Updated last year
- Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler…☆29Updated 9 months ago
- A C++ neural network library for machine learning☆14Updated 11 months ago
- a CUDA implementation of a priority queue☆84Updated 4 years ago
- Some CUDA design patterns and a bit of template magic for CUDA☆150Updated last year
- The curated list of awesome C++ Coroutine resources.☆14Updated last year
- This repository contains various examples of using Eigen library☆14Updated 2 months ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line☆23Updated this week
- Reference implementation of the draft C++ GraphBLAS specification.☆32Updated 2 months ago
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal - all it takes to sum a lot of numbers fast!☆96Updated 2 months ago
- BGHT: High-performance static GPU hash tables.☆63Updated 2 weeks ago
- Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as ta…☆45Updated 3 years ago
- cuASR: CUDA Algebra for Semirings☆35Updated 2 years ago
- A simple and fast minimalistic header-only library allowing to run async tasks and execute task graphs.☆53Updated 4 months ago
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)☆60Updated last month
- A single header-only C++ library for automatic / algorithmic differentiation.☆12Updated 2 years ago
- Fast and full-featured Matrix Market I/O library for C++, Python, and R☆78Updated 8 months ago
- The repository targets the OpenCL gemm function performance optimization. It compares several libraries clBLAS, clBLAST, MIOpenGemm, Inte…☆17Updated 6 years ago
- Generate simple index ranges in C++ and CUDA C++☆39Updated last year
- Universal General Purpose A* for a GPU platform☆35Updated 8 years ago
- A continuously evolving basic template for cpp development practice.☆19Updated last week