NVIDIA / kmeans
kmeans clustering with multi-GPU capabilities
☆114 · Updated last year
Related projects:
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory ☆287 · Updated 5 years ago
- Kernel Fusion and Runtime Compilation Based on NNVM ☆69 · Updated 7 years ago
- Python bindings for NVTX ☆66 · Updated last year
- Efficient Top-K implementation on the GPU ☆143 · Updated 5 years ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆144 · Updated last year
- Simple example of implementing a new Tensorflow operation and its gradient in C++. ☆56 · Updated 5 years ago
- Example of how to use CUDA with CMake >= 3.8 ☆69 · Updated last year
- This repository contains the results and code for the MLPerf™ Training v0.5 benchmark. ☆35 · Updated last year
- Code for testing the native float16 matrix multiplication performance on Tesla P100 and V100 GPU based on cublasHgemm ☆34 · Updated 5 years ago
- kmeans ☆53 · Updated 8 years ago
- Intel® Optimization for Chainer*, a Chainer module providing numpy like API and DNN acceleration using MKL-DNN. ☆162 · Updated last week
- Full-speed Array of Structures access ☆155 · Updated last year
- CUDA Data Parallel Primitives Library ☆418 · Updated 5 years ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆81 · Updated 6 months ago
- GPU-specialized parameter server for GPU machine learning. ☆100 · Updated 6 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆96 · Updated 7 years ago
- CUDA by practice ☆110 · Updated 4 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 · Updated 7 years ago
- Source code that accompanies The CUDA Handbook. ☆493 · Updated 2 years ago
- A CUDNN minimal deep learning training code sample using LeNet. ☆257 · Updated last year
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL ☆135 · Updated 7 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆106 · Updated 3 months ago
- flexible-gemm conv of deepcore ☆17 · Updated 4 years ago
- Tools and extensions for CUDA profiling ☆63 · Updated 4 years ago
- cuDNN sample codes provided by Nvidia ☆42 · Updated 5 years ago
- Introduction to CUDA programming ☆111 · Updated 7 years ago