fengChenHPC / kmeans_cuda
A high-performance implementation of the k-means algorithm with CUDA
☆18 · Updated 10 years ago
Alternatives and similar repositories for kmeans_cuda:
Users who are interested in kmeans_cuda compare it to the libraries listed below.
- A CUDA implementation of the PageRank Pipeline Benchmark ☆32 · Updated 8 years ago
- CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format ☆21 · Updated 6 years ago
- Proximal Asynchronous SAGA ☆12 · Updated 7 years ago
- Fork of magma to include more BLAS ☆28 · Updated 8 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 · Updated 5 years ago
- A minimalistic header only C++11 Neural Network library based on Eigen::Tensor ☆20 · Updated 7 years ago
- CUDA Matrix Factorization Library with Stochastic Gradient Descent (SGD) ☆71 · Updated 7 years ago
- A C++ library for Linear and Logistic Regression. ☆8 · Updated 7 years ago
- Efficient LDA solution on GPUs. ☆24 · Updated 6 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 · Updated 7 years ago
- Log-Bilinear Document Model ☆18 · Updated 13 years ago
- Simple and Cutting-edge Deep Learning Library accelerated with GPU using C++ AMP ☆19 · Updated 9 years ago
- Efficient graph clustering software for normalized cut and ratio association on undirected graphs. Copyright(c) 2008 Brian Kulis, Yuqiang… ☆22 · Updated 12 years ago
- A framework for index based similarity search. ☆19 · Updated 5 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 · Updated 7 years ago
- High Dimensional Approximate Near(est) Neighbor ☆33 · Updated 7 years ago
- HogWild++: A New Mechanism for Decentralized Asynchronous Stochastic Gradient Descent ☆33 · Updated 8 years ago
- ☆12 · Updated 4 years ago
- CSR-based SpMV on Heterogeneous Processors (Intel Broadwell, AMD Kaveri and nVidia Tegra K1) ☆27 · Updated 9 years ago
- The "CUDA templates" are a collection of C++ template classes and functions which provide a consistent interface to NVIDIA's "Compute Uni… ☆27 · Updated 13 years ago
- Sources for OpenCL and CUDA tutorials. http://jlaning.com ☆20 · Updated 9 years ago
- Test winograd convolution written in TVM for CUDA and AMDGPU ☆41 · Updated 6 years ago
- Sparse matrix computation library for GPU ☆56 · Updated 4 years ago
- image to column ☆30 · Updated 10 years ago
- CUDA implementation of k-means ☆23 · Updated 11 years ago
- High-Performance Streaming Graph Analytics on GPUs ☆31 · Updated 6 years ago
- Asynchronous Stochastic Gradient Descent with Delay Compensation ☆21 · Updated 7 years ago
- Artifact of paper "Exploiting Recent SIMD Architectural Advances for Irregular Applications" ☆11 · Updated 8 years ago
- Sparse-dense matrix-matrix multiplication on GPUs ☆14 · Updated 6 years ago
- Proof of concept prototype to perform distributed training using BVLC/caffe, based on a parameter server implementation using MPI. Data p… ☆13 · Updated 9 years ago