abduld / libwb
☆87 Updated 6 years ago
Alternatives and similar repositories for libwb
Users that are interested in libwb are comparing it to the libraries listed below
- Facebook's CUDA extensions. ☆284 Updated 6 years ago
- ☆74 Updated 2 years ago
- Intel(R) Concurrent Collections for C++ ☆116 Updated 3 years ago
- Documentation for StreamExecutor open source proposal ☆83 Updated 9 years ago
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory ☆299 Updated 7 years ago
- CUDA Data Parallel Primitives Library ☆438 Updated 7 years ago
- A heterogeneous multi-GPU level-3 BLAS library ☆46 Updated 6 years ago
- Full-speed Array of Structures access ☆176 Updated 2 years ago
- Parallel Algorithm Scheduling Library ☆106 Updated 8 years ago
- The StreamIt compiler infrastructure. ☆71 Updated 9 years ago
- A fast and highly scalable GPU dynamic memory allocator ☆112 Updated 10 years ago
- A C++ implementation of MapReduce without a distributed filesystem ☆267 Updated 9 years ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆78 Updated 5 years ago
- Caffe deep learning framework - optimized for Xeon Phi ☆14 Updated 10 years ago
- GPUfs - File system support for NVIDIA GPUs ☆99 Updated 7 years ago
- CL Offline Compiler: Compile OpenCL kernels to HSAIL ☆50 Updated 8 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆21 Updated 8 years ago
- Benchmarking matrix multiplication implementations ☆103 Updated 9 years ago
- Tools and extensions for CUDA profiling ☆67 Updated 6 years ago
- C++ implementation of concurrent Binary Search Trees ☆72 Updated 10 years ago
- Easy to run kernels using OpenCL ☆187 Updated 9 months ago
- Benchmark for Co-running Single Applications on Integrated Architectures ☆12 Updated 9 years ago
- Grappa: scaling irregular applications on commodity clusters ☆159 Updated 8 years ago
- Flexible GPGPU instrumentation ☆89 Updated 6 years ago
- PMLS-Caffe: Distributed Deep Learning Framework for Parallel ML System ☆193 Updated 7 years ago
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources ☆125 Updated 9 months ago
- A benchmark of some prominent C/C++ hash table implementations ☆104 Updated 6 years ago
- Intel(R) Machine Learning Scaling Library is a library providing an efficient implementation of communication patterns used in deep learn… ☆108 Updated 3 years ago
- Symbolic Expression and Statement Module for new DSLs ☆205 Updated 5 years ago
- clang with OpenMP 3.1 and some elements of OpenMP 4.0 support ☆90 Updated 10 years ago