fengChenHPC / kmeans_cuda
A high-performance implementation of the k-means algorithm with CUDA
☆18 Updated 10 years ago
Related projects
Alternatives and complementary repositories for kmeans_cuda
- A CUDA implementation of the PageRank Pipeline Benchmark ☆32 Updated 7 years ago
- Fork of magma to include more BLAS ☆28 Updated 7 years ago
- Dolphin - a Deep Learning on MIC architecture Project. ☆25 Updated 10 years ago
- A minimalistic header only C++11 Neural Network library based on Eigen::Tensor ☆20 Updated 6 years ago
- Simple and Cutting-edge Deep Learning Library accelerated with GPU using C++ AMP ☆19 Updated 8 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 Updated 4 years ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations ☆16 Updated 4 years ago
- Deep neural network framework for multiple GPUs ☆30 Updated 9 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- SRS - Fast Approximate Nearest Neighbor Search in High Dimensional Euclidean Space With a Tiny Index ☆54 Updated 9 years ago
- CUDA Matrix Factorization Library with Stochastic Gradient Descent (SGD) ☆71 Updated 6 years ago
- Test winograd convolution written in TVM for CUDA and AMDGPU ☆40 Updated 6 years ago
- GraphMat graph analytics framework ☆101 Updated last year
- Different implementation of sparse matrix multiplication. All matrices are in CSR format. The code contains different CUDA kernels for mu… ☆16 Updated 14 years ago
- Efficient LDA solution on GPUs. ☆24 Updated 6 years ago
- High Dimensional Approximate Near(est) Neighbor ☆33 Updated 7 years ago
- A framework for index based similarity search. ☆19 Updated 5 years ago
- Proof of concept prototype to perform distributed training using BVLC/caffe, based on a parameter server implementation using MPI. Data p… ☆13 Updated 9 years ago
- image to column ☆31 Updated 10 years ago
- The "CUDA templates" are a collection of C++ template classes and functions which provide a consistent interface to NVIDIA's "Compute Uni… ☆27 Updated 13 years ago
- Deep neural network framework (C/C++/CUDA). ☆31 Updated 9 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 Updated 7 years ago
- High-Performance Streaming Graph Analytics on GPUs ☆31 Updated 5 years ago
- HogWild++: A New Mechanism for Decentralized Asynchronous Stochastic Gradient Descent ☆33 Updated 8 years ago
- ONNX Parser is a tool that automatically generates openvx inference code (CNN) from onnx binary model files. ☆17 Updated 5 years ago
- CuSha is a CUDA-based vertex-centric graph processing framework that uses G-Shards and CW representations. ☆52 Updated 9 years ago
- CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format ☆20 Updated 6 years ago
- CUDA implementation of data clustering using expectation maximization with a Gaussian mixture model. Supports multiple GPUs on a single n… ☆26 Updated 12 years ago