DEShawResearch / random123
Counter-based random number generators for C, C++ and CUDA.
☆98 · Updated last year
Alternatives and similar repositories for random123
Users who are interested in random123 are comparing it to the libraries listed below.
- Compiler-agnostic metaprogramming library providing concepts, type operations and tuples for C++ and CUDA ☆87 · Updated this week
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 · Updated 10 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Updated last year
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆108 · Updated last week
- BLAS++ is a C++ wrapper around CPU and GPU BLAS (basic linear algebra subroutines), developed as part of the SLATE project. ☆78 · Updated 2 weeks ago
- Kokkos C++ Performance Portability Programming Ecosystem: Profiling and Debugging Tools ☆123 · Updated 2 weeks ago
- Header-only C++20 wrapper for MPI 4.0. ☆46 · Updated last year
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy… ☆114 · Updated 4 months ago
- DLA-Future ☆73 · Updated this week
- Use CUDA intrinsics with user-defined types ☆47 · Updated 10 years ago
- CUDA kernel author's tools ☆111 · Updated 3 years ago
- mallocMC: Memory Allocator for Many Core Architectures ☆55 · Updated this week
- Reference implementation of the draft C++ GraphBLAS specification. ☆32 · Updated 2 months ago
- Copy-hiding array abstraction to automatically migrate data between memory spaces ☆107 · Updated last week
- A mirror of the CRLibm project from INRIA Forge ☆46 · Updated 4 years ago
- Basic Tensor Algebra Subroutines ☆48 · Updated 3 weeks ago
- A fast implementation of log() and exp() ☆53 · Updated 2 years ago
- Codeplay project for contributions to the LLVM SYCL implementation ☆30 · Updated 4 years ago
- Partitioned Global Address Space (PGAS) library for distributed arrays ☆103 · Updated last week
- Reproducible random number generation for parallel computations ☆30 · Updated last week
- Synchronous, single-threaded, library-only SYCL implementation for debugging and verification. ☆35 · Updated 2 weeks ago
- A C++17 message passing library based on MPI ☆169 · Updated last week
- SYCL Conformance Tests ☆70 · Updated last week
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆72 · Updated last month
- Implementation of AMD HIP for CPUs ☆22 · Updated 4 years ago
- An implementation of HIP that works on CPUs, across OSes. ☆116 · Updated last year
- C++ template library for floating point operations ☆27 · Updated 2 weeks ago
- Distributed ranges is a generalization of C++ ranges for distributed data structures. ☆50 · Updated 2 weeks ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆44 · Updated this week
- Reusable software components for ROCm developers ☆83 · Updated this week