ChrisCummins / cldrive
Run arbitrary OpenCL kernels
☆10 Updated last year
Alternatives and similar repositories for cldrive
Users who are interested in cldrive are comparing it to the repositories listed below.
- Deep learning program generator ☆107 Updated last year
- "End-to-end Deep Learning of Optimization Heuristics" (🥇 PACT'17 Best Paper) ☆72 Updated 2 years ago
- "Synthesizing Benchmarks for Predictive Modeling" (🥇 CGO'17 Best Paper) ☆22 Updated 2 years ago
- Source code for "BenchPress: A Deep Active Benchmark Generator", PACT 2022 ☆21 Updated 2 years ago
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆111 Updated 2 months ago
- A Halide language to MLIR compiler. ☆26 Updated 3 years ago
- Kernel Tuning Toolkit ☆60 Updated last month
- Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git) ☆126 Updated 2 years ago
- GPUVerify: a Verifier for GPU Kernels ☆62 Updated 2 years ago
- Asynchronous Task and Memory Interface, or ATMI, is a runtime framework and programming model for heterogeneous CPU-GPU systems. It provi… ☆68 Updated last year
- 👨‍💻 My PhD. ☆186 Updated 2 years ago
- Neural Code Comprehension: A Learnable Representation of Code Semantics ☆213 Updated 7 months ago
- NeuroVectorizer is a framework that uses deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas for for loops… ☆94 Updated 2 years ago
- Decuda and cudasm, the CUDA binary utilities package. Low-level tools for NVidia G80 GPUs. ☆102 Updated 14 years ago
- Tapir extension to LLVM for optimizing Parallel Programs ☆134 Updated 5 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 Updated last year
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 Updated 10 years ago
- Clover: Quantized 4-bit Linear Algebra Library ☆114 Updated 7 years ago
- Instruction THroughput Estimator using MAchine Learning (ITHEMAL) ☆147 Updated 3 years ago
- ☆255 Updated 3 weeks ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com ☆38 Updated last year
- CUDA kernel author's tools ☆111 Updated 3 years ago
- Kernel Fusion and Runtime Compilation Based on NNVM ☆70 Updated 8 years ago
- CUDA templates for tile-sparse matrix multiplication based on CUTLASS. ☆51 Updated 7 years ago
- GPUOCelot: A dynamic compilation framework for PTX ☆287 Updated last year
- A source-to-source compiler for automatic parallelization of C programs through code annotation. ☆62 Updated 5 years ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆133 Updated last year
- The SHOC Benchmark Suite ☆256 Updated 3 years ago
- Record GPU memory accesses of a CUDA program and visualize the access pattern in a browser ☆13 Updated 4 years ago
- A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations ☆318 Updated last year