Several implementations of sparse matrix multiplication in CUDA. All matrices are stored in CSR format. The code contains separate CUDA kernels for multiplying a sparse matrix by a dense vector and for multiplying a sparse matrix by another sparse matrix.
☆17 · Nov 15, 2010 · Updated 15 years ago
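For readers unfamiliar with CSR, the sparse matrix by dense vector product described above can be sketched on the CPU as follows. This is a minimal reference sketch, not code from CudaDotProd: the `CsrMatrix` struct, its field names, and the `spmv` function are hypothetical names chosen for illustration. The outer loop over rows is the part a simple CUDA kernel would typically assign to one thread per row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR container; names are illustrative, not taken from CudaDotProd.
struct CsrMatrix {
    std::size_t rows;                  // number of matrix rows
    std::vector<std::size_t> row_ptr;  // size rows + 1; row r occupies [row_ptr[r], row_ptr[r + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored value
    std::vector<double> values;        // nonzero values, stored row by row
};

// Computes y = A * x for a CSR matrix A and a dense vector x.
std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {
        double sum = 0.0;
        // Walk only the stored nonzeros of row r.
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = sum;
    }
    return y;
}
```

Because each row's result depends only on that row's slice of `values` and `col_idx`, the rows can be computed independently, which is what makes this product a natural fit for a GPU.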
Alternatives and similar repositories for CudaDotProd
Users that are interested in CudaDotProd are comparing it to the libraries listed below.
- SpMV using CUDA ☆20 · Mar 5, 2018 · Updated 8 years ago
- Homemade Pixel Art Tool (WIP) ☆17 · Oct 18, 2024 · Updated last year
- A simple but efficient C++ thread/worker pool library for asynchronous task management. ☆10 · Jul 11, 2023 · Updated 2 years ago
- CS510 Advanced Topics in Concurrency Project ☆16 · Jun 4, 2020 · Updated 5 years ago
- A cache that automatically removes the least-recently-used items ☆18 · Dec 16, 2024 · Updated last year
- Ring network model test to demonstrate the use of CoreNEURON ☆11 · Aug 19, 2025 · Updated 7 months ago
- An Android app that uses OpenCL to perform spatial filtering ☆21 · Mar 28, 2013 · Updated 13 years ago
- A version of the SQL Editor from the W3C site, prepared with a Turkish database. ☆11 · Dec 10, 2015 · Updated 10 years ago
- C++11 Header-only continuous-storage Double ended vector implementation similar to STL's std::vector for efficient insertions/removals at… ☆16 · Dec 29, 2022 · Updated 3 years ago
- Python caching libraries benchmark - which is better? ☆12 · Sep 27, 2025 · Updated 6 months ago
- A CUDA-C implementation of FOFE and FSMN ☆19 · Aug 5, 2016 · Updated 9 years ago
- Second Cloud Native Programming Challenge: RocketMQ storage system design, 4th place, code from team 我之渺小 ☆11 · Nov 3, 2021 · Updated 4 years ago
- ☆10 · Aug 4, 2022 · Updated 3 years ago
- Sparse Recurrent Neural Networks -- Pruning Connections and Hidden Sizes (TensorFlow) ☆74 · Jul 25, 2020 · Updated 5 years ago
- An implementation of the Pregel graph processing system on the Spark cluster computing framework. Merged into Spark; please see: ☆11 · Apr 9, 2011 · Updated 15 years ago
- Webcam Image Processing with CUDA using OpenCV ☆16 · Aug 30, 2014 · Updated 11 years ago
- Hybrid computing engine executed by both GPU and multicore to accelerate PH matrix reduction ☆13 · Dec 2, 2019 · Updated 6 years ago
- Parallel/GPU level set volume segmentation using OpenCL ☆18 · Apr 23, 2019 · Updated 6 years ago
- Tomasulo Simulator written in React as the project for Computer Architecture course, Spring 2019, Tsinghua University ☆11 · Jun 9, 2019 · Updated 6 years ago
- http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36266.pdf ☆14 · Apr 25, 2012 · Updated 13 years ago
- Serving Images dynamically based on the client device is an important part of Web Page Resource Optimization. ImgR.NET aims at automating… ☆12 · Sep 28, 2016 · Updated 9 years ago
- 3D Haar Discrete Wavelet Transform C++11 library (using OpenMP and SSE/SSE2/SSE3/AVX). ☆17 · May 26, 2017 · Updated 8 years ago
- Code for "On Long-Tailed Phenomena in NMT". ☆10 · Jan 10, 2021 · Updated 5 years ago
- C++ header-only library to create class factories registered by name. ☆23 · Nov 27, 2018 · Updated 7 years ago
- OpenCL porting of the GROMACS molecular simulation toolkit ☆27 · Sep 5, 2015 · Updated 10 years ago
- Vulkan implementations of Subsurface Scattering and Ambient Occlusion ☆16 · Jun 4, 2017 · Updated 8 years ago
- ☆11 · Apr 2, 2021 · Updated 5 years ago
- Gale&Church (1993) sentence alignment ☆16 · May 9, 2020 · Updated 5 years ago
- Implementation of the SHA-3 family using AVX/AVX2 instructions. ☆14 · Oct 5, 2018 · Updated 7 years ago
- Deep learning model of machine translation using attentional and structural biases ☆13 · Jul 21, 2017 · Updated 8 years ago
- Real time 2D simulator of fluid mechanics in C++/Qt/OpenGl ☆26 · Jul 31, 2014 · Updated 11 years ago
- Cubesharp is a 3D modelling software written in pure C# and OpenGL. (WARNING: Extremely Unstable) ☆19 · Jun 5, 2017 · Updated 8 years ago
- UltraFast GPU Grammar eXtractor for Machine Translation (He et al., TACL 2015 & NAACL 2013) ☆12 · Jun 19, 2015 · Updated 10 years ago
- 3D smoke simulation by Eulerian grid method for solving Navier-Stokes equation and rendering by volume ray-casting, all in C++ AMP ☆19 · Feb 6, 2020 · Updated 6 years ago
- Whippletree, a novel approach to scheduling dynamic, irregular workloads on the GPU ☆23 · Nov 24, 2015 · Updated 10 years ago
- Simple lib for easy usage of RSA encryption of form fields in php / javascript based web apps. ☆20 · Dec 11, 2015 · Updated 10 years ago
- Sparse matrix computation library for GPU ☆59 · Jul 12, 2020 · Updated 5 years ago
- A CLI to create sets of responsive images for the web ☆19 · Apr 7, 2026 · Updated last week
- Code for the paper Faster Phrase-Based Decoding by Refining Feature State ☆14 · Jan 9, 2023 · Updated 3 years ago