divyanshu-talwar / Parallel-DFS
CUDA implementation of a parallel Depth First Search (DFS) algorithm and its comparison with a serial C++ DFS implementation.
☆29 Updated 6 years ago
Alternatives and similar repositories for Parallel-DFS:
Users who are interested in Parallel-DFS are comparing it to the libraries listed below.
- Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019 ☆55 Updated 2 years ago
- GPU B-Tree with support for versioning (snapshots). ☆47 Updated 5 months ago
- A warp-oriented dynamic hash table for GPUs ☆73 Updated last year
- A Library for fast Hash Tables on GPUs ☆115 Updated 2 years ago
- Implementation of parallel Breadth First Algorithm for graph traversal using CUDA and C++ language. ☆32 Updated 5 years ago
- Implementation of breadth first search on GPU with CUDA Driver API. ☆48 Updated 3 years ago
- a CUDA implementation of a priority queue ☆84 Updated 4 years ago
- Concurrent CPU-GPU Programming using Task Models ☆101 Updated 5 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆50 Updated 2 weeks ago
- IMPACT GPU Algorithms Teaching Labs ☆57 Updated last year
- Learn OpenMP examples step by step ☆91 Updated 2 months ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆150 Updated last year
- A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines ☆354 Updated 3 months ago
- ☆32 Updated 4 years ago
- BGHT: High-performance static GPU hash tables. ☆62 Updated 6 months ago
- ☆47 Updated 2 years ago
- Multi-GPU dynamic scheduler using PGAS style cross-GPU communication ☆28 Updated last year
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆23 Updated 2 months ago
- Asynchronous Multi-GPU Programming Framework ☆46 Updated 3 years ago
- Stencil Probe - a stencil microbenchmark ☆30 Updated 12 years ago
- An implementation of parallel exclusive scan in CUDA ☆62 Updated 7 years ago
- Profiling Taskflow Programs through Visualization ☆50 Updated 2 years ago
- A minimalistic header only C++11 Neural Network library based on Eigen::Tensor ☆20 Updated 7 years ago
- CUDA kernel author's tools ☆111 Updated 2 years ago
- Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task q… ☆45 Updated 4 months ago
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆23 Updated last year
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- Repository holding the code base to AC-SpGEMM : "Adaptive Sparse Matrix-Matrix Multiplication on the GPU" ☆28 Updated 4 years ago
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆76 Updated 7 months ago
- A 128 bit unsigned integer class for CUDA ☆45 Updated 3 months ago