eliben / cs344
Introduction to Parallel Programming class code
☆30Updated 10 years ago
Alternatives and similar repositories for cs344:
Users who are interested in cs344 are comparing it to the libraries listed below
- Symbolic differentiation engine for optimization-based machine learning models.☆42Updated 7 years ago
- Fork of magma to include more BLAS☆28Updated 8 years ago
- GPU Automatically Tuned Linear Algebra Software☆28Updated 9 years ago
- A portable high-level API with CUDA or OpenCL back-end☆54Updated 7 years ago
- Sample implementation of a proposed C++ hashing framework☆29Updated 9 years ago
- A GPU / CPU implementation of a feed forward neural network☆31Updated 9 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms☆12Updated 7 years ago
- Scientific library for high-precision computations and research☆49Updated 7 years ago
- Proof-of-Concept CNN in Halide☆22Updated 8 years ago
- Parallel Algorithm Scheduling Library☆106Updated 7 years ago
- CNNs in Halide☆23Updated 9 years ago
- Examples from Second Edition of Discovering Modern C++☆22Updated 6 years ago
- Launching collective tasks in bulk☆37Updated 5 years ago
- Deep neural network framework (C/C++/CUDA).☆31Updated 9 years ago
- The "CUDA templates" are a collection of C++ template classes and functions which provide a consistent interface to NVIDIA's "Compute Uni…☆27Updated 13 years ago
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL☆135Updated 7 years ago
- Full-speed Array of Structures access☆167Updated last year
- Communication-Minimizing 2D Convolution in GPU Registers☆30Updated 11 years ago
- GPU Optimization and Memory Abstraction Framework☆32Updated 5 years ago
- C++ Summer Lecture Series 2016☆13Updated 8 years ago
- C++ library [machine learning & numerical optimization] - superseded by libnano☆1Updated 6 years ago
- C++ implementation of concurrent Binary Search Trees☆72Updated 9 years ago
- A Light-weight and Fast Template Matrix Library☆132Updated 12 years ago
- C++ convenience classes to be used with CUDA code, for both the host and the kernel parts.☆55Updated 6 years ago
- A CUDA implementation of the PageRank Pipeline Benchmark☆32Updated 8 years ago
- This repository contains components that will support percolation via OpenCL and CUDA☆32Updated 3 years ago
- This repository contains easy-to-read Python/CUDA implementations of fundamental GPU computing primitives.☆36Updated 9 years ago
- CMake module collection☆30Updated 9 years ago
- Related materials for Coursera & edX MOOCs; will no longer be updated.☆63Updated 8 years ago
- a heterogeneous multiGPU level-3 BLAS library☆45Updated 5 years ago