wichtounet / mnist
Simple C++ reader for MNIST dataset
☆82Updated 5 years ago
Related projects
Alternatives and complementary repositories for mnist
- CUDA kernel author's tools☆109Updated 2 years ago
- Full-speed Array of Structures access☆161Updated last year
- A CUDNN minimal deep learning training code sample using LeNet.☆263Updated last year
- ☆41Updated 6 years ago
- C++ convenience classes to be used with CUDA code, for both the host and the kernel parts.☆55Updated 6 years ago
- Parallel Algorithm Scheduling Library☆103Updated 7 years ago
- a CUDA implementation of a priority queue☆81Updated 4 years ago
- Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++.☆343Updated 2 years ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).☆518Updated 6 months ago
- Demonstration of various hardware effects on CUDA GPUs.☆358Updated last year
- AVX-optimized sin(), cos(), exp() and log() functions☆113Updated 2 years ago
- Task graph-based asynchronous programming system using C++ coroutine☆84Updated 9 months ago
- Cooperative Primitives for CUDA C++ Kernel Authors. This repository contains CUB PRs from Q4 2019 until Q4 2020.☆22Updated 4 years ago
- portDNN is a library implementing neural network algorithms written using SYCL☆108Updated 6 months ago
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels.☆70Updated 9 years ago
- Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast!☆73Updated 6 months ago
- CUDA Data Parallel Primitives Library☆421Updated 6 years ago
- Symbolic Expression and Statement Module for new DSLs☆206Updated 4 years ago
- Blazing-fast Expression Templates Library (ETL) with GPU support, in C++☆221Updated 11 months ago
- Header-only C++ library for low precision floating point type emulation.☆163Updated 4 years ago
- Thin, unified, C++-flavored wrappers for the CUDA APIs☆797Updated this week
- Compile-time mathematical functions for C++14☆186Updated 3 years ago
- Range-based for loops to iterate over a range of numbers or values☆35Updated 7 years ago
- ulmBLAS☆104Updated 2 years ago
- A cross-platform CUDA/C++17 starter project with google test and google benchmark support.☆37Updated last year
- Example of how to use CUDA with CMake >= 3.8☆69Updated last year
- ☆486Updated this week
- Some CUDA design patterns and a bit of template magic for CUDA☆146Updated last year
- ☆132Updated last year
- Convert CUDA programs from float data type to half or half2 with SIMDization☆20Updated 5 years ago