fengChenHPC / kmeans_cuda
A high-performance implementation of the k-means algorithm with CUDA
☆18 · Updated 10 years ago
Alternatives and similar repositories for kmeans_cuda:
Users who are interested in kmeans_cuda are comparing it to the libraries listed below:
- A CUDA implementation of the PageRank Pipeline Benchmark ☆32 · Updated 8 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 · Updated 7 years ago
- GraphMat graph analytics framework ☆101 · Updated 2 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 · Updated 5 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 · Updated 7 years ago
- Fork of magma to include more BLAS ☆28 · Updated 8 years ago
- CUDA Matrix Factorization Library with Stochastic Gradient Descent (SGD) ☆71 · Updated 7 years ago
- CuSha is a CUDA-based vertex-centric graph processing framework that uses G-Shards and CW representations. ☆52 · Updated 9 years ago
- Dolphin - a Deep Learning on MIC architecture Project. ☆25 · Updated 10 years ago
- Training a Tensorflow graph in C++ ☆25 · Updated 8 years ago
- Proximal Asynchronous SAGA ☆12 · Updated 7 years ago
- GPU Optimization and Memory Abstraction Framework ☆32 · Updated 5 years ago
- A Distributed Multi-GPU System for Fast Graph Processing ☆65 · Updated 6 years ago
- Graph Challenge ☆31 · Updated 5 years ago
- The "CUDA templates" are a collection of C++ template classes and functions which provide a consistent interface to NVIDIA's "Compute Uni… ☆27 · Updated 13 years ago
- Proof of concept prototype to perform distributed training using BVLC/caffe, based on a parameter server implementation using MPI. Data p… ☆13 · Updated 9 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 · Updated 9 years ago
- SRS - Fast Approximate Nearest Neighbor Search in High Dimensional Euclidean Space With a Tiny Index ☆55 · Updated 9 years ago
- CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format ☆21 · Updated 6 years ago
- A framework for pipelined computing on GPU ☆29 · Updated 5 years ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations ☆16 · Updated 4 years ago
- A minimalistic header only C++11 Neural Network library based on Eigen::Tensor ☆20 · Updated 7 years ago
- CSR-based SpMV on Heterogeneous Processors (Intel Broadwell, AMD Kaveri and nVidia Tegra K1) ☆27 · Updated 9 years ago
- Deep neural network framework for multiple GPUs ☆33 · Updated 9 years ago
- Test winograd convolution written in TVM for CUDA and AMDGPU ☆41 · Updated 6 years ago
- kmeans ☆54 · Updated 8 years ago
- A Comprehensive Benchmark Suite for Graph Computing ☆67 · Updated 6 years ago
- Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code ☆61 · Updated 4 years ago
- Proof-of-Concept CNN in Halide ☆22 · Updated 8 years ago
- Asynchronous Stochastic Gradient Descent with Delay Compensation ☆21 · Updated 7 years ago