❤️ CUDA/C++ GPU graph analytics simplified.
☆32Sep 19, 2022Updated 3 years ago
Alternatives and similar repositories for essentials
Users who are interested in essentials compare it to the libraries listed below
- 🎃 GPU load-balancing library for regular and irregular computations.☆66Sep 9, 2025Updated 6 months ago
- mini is mini☆20Jan 19, 2020Updated 6 years ago
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018☆73Oct 5, 2020Updated 5 years ago
- Multi-GPU dynamic scheduler using PGAS style cross-GPU communication☆29Jul 23, 2023Updated 2 years ago
- Open-source library for Graph Streaming. Solves the connected components problem using sub-linear space. Published in SIGMOD'22.☆10Mar 12, 2026Updated last week
- ☆12May 21, 2020Updated 5 years ago
- Use NVIDIA CUPTI from within GO☆10Sep 26, 2019Updated 6 years ago
- Programmable CUDA/C++ GPU Graph Analytics☆1,071Feb 28, 2026Updated 3 weeks ago
- Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code☆63Oct 17, 2020Updated 5 years ago
- Three dimensional atmospheric dynamical core using the Gung Ho numerics.☆18Updated this week
- LonestarGPU: Irregular algorithms parallelized for GPUs☆38Nov 11, 2019Updated 6 years ago
- Resources for the introductory GPU programming course of the HPC-AI specialized master's program☆23Jan 11, 2024Updated 2 years ago
- Statistics on GPUs☆33Sep 8, 2025Updated 6 months ago
- (ARCHIVED) Two-stage form compiler☆16Mar 31, 2025Updated 11 months ago
- ☆23Feb 16, 2022Updated 4 years ago
- ☆17Feb 26, 2020Updated 6 years ago
- Sparse matrix-matrix multiplication on CPU+GPU systems.☆13Mar 17, 2014Updated 12 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo☆12Mar 2, 2026Updated 2 weeks ago
- ☆31Aug 28, 2020Updated 5 years ago
- This code base represents "faimGraph: High Performance Management of Fully-dynamic Graphs under tight Memory Constraints on the GPU"☆14Apr 23, 2021Updated 4 years ago
- OCCA Python API: JIT Compilation for Multiple Architectures☆11Dec 20, 2019Updated 6 years ago
- ☆13Nov 4, 2020Updated 5 years ago
- A reference implementation of std::simd, providing data parallel types in the C++ standard☆14Mar 9, 2020Updated 6 years ago
- GPU MemoryManager based on virtualized queues☆27Jun 25, 2022Updated 3 years ago
- Source code for the paper: Accelerating Dynamic Graph Analytics on GPUs☆30Jun 19, 2023Updated 2 years ago
- A Memory-efficient Graph Store for Interactive Queries☆13Sep 1, 2021Updated 4 years ago
- Source code supporting the High Performance Graphics 2022 paper: Supporting Unified Shader Specialization by Co-opting C++ Features☆14Jul 9, 2022Updated 3 years ago
- CUDA Kernel Benchmarking Library☆831Updated this week
- Record GPU memory accesses of a CUDA program and visualize the access pattern in a browser☆13Nov 17, 2020Updated 5 years ago
- SIMD-X: Programming and Processing of Graph Algorithms on GPUs [USENIX ATC '19]☆23Jun 14, 2020Updated 5 years ago
- A tracing JIT compiler for PyTorch☆13Dec 11, 2021Updated 4 years ago
- Galois: C++ library for multi-core and multi-node parallelization☆348May 16, 2024Updated last year
- Efficient and scalable spectral transforms☆24Updated this week
- Lightweight speaker anonymization [IEEE SLT2021]☆27Jun 6, 2022Updated 3 years ago
- A recommendation model kernel optimizing system☆12Jun 5, 2025Updated 9 months ago
- GBDT-based model with efficient unlearning (SIGMOD 2023)☆10Sep 7, 2025Updated 6 months ago
- A blazing fast, MT-safe, lockfree and branchless circular byte buffer for SPSC in 50 loc☆13Sep 16, 2025Updated 6 months ago
- A BUDE virtual-screening benchmark, in many programming models☆30Oct 15, 2024Updated last year
- LU Decomposition using CUDA☆13Dec 7, 2013Updated 12 years ago