SimeonEhrig / CUDA-Runtime-Interpreter
A prototype interpreter that can interpret the host code of a CUDA program written with the runtime API.
☆9 Updated 5 years ago
Alternatives and similar repositories for CUDA-Runtime-Interpreter
Users who are interested in CUDA-Runtime-Interpreter are comparing it to the repositories listed below.
- The repository contains container recipes to build the entire stack of Xeus-Cling and Cling including cuda extension with just a few comm… ☆9 Updated 4 years ago
- CuPy Benchmark ☆12 Updated 6 years ago
- Legate Hello World Pedagogical Library ☆10 Updated 2 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- Easy to use benchmarks for linear algebra frameworks ☆24 Updated 5 years ago
- An alternative to Boost.MPI for a user friendly C++ interface for MPI (MPICH). ☆19 Updated 7 years ago
- Scientific algorithms implemented on top of the x-stack (xtensor, xsimd ...) ☆9 Updated 6 years ago
- ☆11 Updated 2 years ago
- This repository contains components that will support percolation via OpenCL and CUDA ☆32 Updated 3 years ago
- tokenizer and parser for circle projects ☆11 Updated 5 years ago
- C++ User interface for the Platform independent Library Alpaka ☆38 Updated 9 months ago
- ☆14 Updated 2 years ago
- Experimental ranges for CUDA ☆24 Updated 6 years ago
- npcomp - An aspirational MLIR based numpy compiler ☆51 Updated 4 years ago
- VIGRA2 based on xtensor ☆10 Updated 7 years ago
- Library for the GPU-accelerated spatial indexing and processing of particles in 2D and 3D with OpenCL. Currently offers trees based on sp… ☆27 Updated 10 months ago
- GPU Automatically Tuned Linear Algebra Software ☆28 Updated 9 years ago
- The parallel API to be utilized by AllScale projects to express parallelism. ☆9 Updated 6 years ago
- A mirror of cinch's internal gitlab repository. ☆22 Updated 2 years ago
- An implementation of ARMCI using MPI one-sided communication (RMA) ☆14 Updated 8 months ago
- Generate and execute native code at run time, from Python ☆53 Updated last month
- Range-based for loops to iterate over a range of numbers or values ☆35 Updated 8 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 Updated 7 years ago
- Cooperative Primitives for CUDA C++ Kernel Authors. This repository contains CUB PRs from Q4 2019 until Q4 2020. ☆22 Updated 4 years ago
- Yaksa: High-performance Noncontiguous Data Management ☆13 Updated 8 months ago
- A thread safe simple C++ wrapper for FFTW & MKL ☆15 Updated 3 years ago
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication ☆21 Updated 5 months ago
- Automatic differentiation with uarray/unumpy. ☆16 Updated 4 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- Portable and Flexible DGEMM Library for GPUs (OpenCL, CUDA, CAL) with special support for HPL ☆17 Updated 7 years ago