jeremad / cuda-travis
☆19Updated 5 years ago
Alternatives and similar repositories for cuda-travis
Users who are interested in cuda-travis are comparing it to the libraries listed below
- ☆29Updated this week
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development.☆35Updated last month
- Generate simple index ranges in C++ and CUDA C++☆39Updated last year
- Training examples for SYCL☆42Updated 2 weeks ago
- Subset of BLAS routines optimized for NVIDIA GPUs☆68Updated 2 years ago
- Distributed View Extension for Kokkos☆45Updated 5 months ago
- A task benchmark☆42Updated 9 months ago
- HTML/JS port of CUDA Occupancy Calculator☆17Updated 3 years ago
- CUDA kernel author's tools☆111Updated 3 years ago
- A unified framework across multiple programming platforms☆37Updated 11 months ago
- Distributed Performance-portable Stencil Computation☆10Updated last year
- sparse matrix pre-processing library☆82Updated last year
- The C++ Standard Library for your entire system.☆17Updated 3 weeks ago
- Range-based for loops to iterate over a range of numbers or values☆35Updated 8 years ago
- Codeplay project for contributions to the LLVM SYCL implementation☆30Updated 4 years ago
- ☆14Updated 4 years ago
- Implementation of AMD HIP for CPUs☆22Updated 4 years ago
- A library for C++/Fortran computer simulations (e.g. stencil codes, mesh-free, unstructured grids, n-body & particle methods). Scales fro…☆40Updated 4 years ago
- A library to benchmark CUDA code, similar to google benchmark.☆28Updated 4 years ago
- Header-only C++20 wrapper for MPI 4.0.☆16Updated last year
- Examples for using SYCL on CUDA☆62Updated 2 months ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri…☆20Updated 2 years ago
- HIP back-end for Thrust that has been replaced by rocThrust☆28Updated 2 years ago
- Use CUDA intrinsics with user-defined types☆47Updated 10 years ago
- GPU Code optimizer for stencil computations. Refer to our IPDPS'19 paper for more details☆24Updated 5 years ago
- Sympiler is a Code Generator for Transforming Sparse Matrix Codes☆42Updated last year
- Intel Data Parallel C++ (and SYCL 2020) Tutorial.☆93Updated 3 years ago
- Advanced Profiling and Analytics for AMD Hardware☆154Updated this week
- Comb is a communication performance benchmarking tool.☆24Updated 2 years ago
- Distributed Communication-Optimal LU-factorization Algorithm☆12Updated 3 years ago