Optimized parallel tiled approach to 2D convolution that takes advantage of the lower-latency, higher-bandwidth shared memory, as well as global constant memory cached aggressively, within GPU thread blocks.
☆15 · Oct 17, 2017 · Updated 8 years ago
Alternatives and similar repositories for cuda-tiled-2D-convolution
Users that are interested in cuda-tiled-2D-convolution are comparing it to the libraries listed below.
- This is a simple 2D convolution written in CUDA C which uses shared memory for better performance ☆19 · Apr 12, 2018 · Updated 7 years ago
- Static-sized long-precision arithmetic library for use inside GPU parallelization with CUDA ☆11 · Apr 5, 2025 · Updated 11 months ago
- ☆11 · Feb 3, 2026 · Updated last month
- Extension of Convex.jl for disciplined multiconvex optimization ☆10 · Feb 22, 2017 · Updated 9 years ago
- "Unix & Linux Shell Script Exercise Dictionary" - Hanbit Media (한빛미디어) ☆10 · Jan 17, 2017 · Updated 9 years ago
- This is the shared package to simulate pulse propagation in bulk material (solid and gas) with 3D-UPPE ☆13 · Feb 3, 2026 · Updated last month
- ☆15 · Mar 15, 2022 · Updated 4 years ago
- Matlab mex wrappers to cuSPARSE (NVIDIA) ☆11 · Dec 10, 2025 · Updated 3 months ago
- ☆11 · Jun 5, 2024 · Updated last year
- Simple async job management service using gRPC ☆16 · Apr 23, 2021 · Updated 4 years ago
- FDTD 3D simulator that generates s-parameters from OFF geometry files using one or more GPUs ☆15 · Jan 16, 2023 · Updated 3 years ago
- Code Generation Based High Speed Data Serialization Tool ☆12 · Dec 27, 2022 · Updated 3 years ago
- An open source first-order MATLAB solver for conic programs with row sparsity. ☆11 · May 30, 2017 · Updated 8 years ago
- ExBLAS: fast, accurate, and reproducible BLAS ☆16 · Sep 13, 2021 · Updated 4 years ago
- An open-source interface to use the multiple-precision solver SDPA-GMP with YALMIP ☆11 · Apr 8, 2021 · Updated 4 years ago
- Inline PTX Assembly in CUDA example ☆13 · May 7, 2022 · Updated 3 years ago
- ☆15 · Jul 6, 2022 · Updated 3 years ago
- Everybody, shake your body, let's go body, come on body ~ ♪ In Jeju there's dageumbari ~ ♪ At Depromeet there's everybody ~ ♪ ☆12 · Nov 20, 2022 · Updated 3 years ago
- Code for the article "Automatic Temperature Control for Neural Machine Translation" (EMNLP 2018) ☆14 · Apr 16, 2019 · Updated 6 years ago
- Notes on the YouTube lecture "2017 Numerical methods of PDE", given by Qiqi Wang ☆14 · Jun 18, 2018 · Updated 7 years ago
- This example shows how to perform quantization-aware training for a transfer-learned MobileNet-v2 network. ☆12 · Dec 19, 2023 · Updated 2 years ago
- ☆16 · Apr 2, 2023 · Updated 2 years ago
- Kafka Streams DSL inspired, Stream processing library abstracting pipelines pattern using generic. ☆15 · Dec 21, 2022 · Updated 3 years ago
- This repository contains source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 · Mar 7, 2023 · Updated 3 years ago
- ROS Waypoints Global Planner ☆10 · Apr 7, 2025 · Updated 11 months ago
- Integrating Devito operators into PyTorch ☆13 · Mar 17, 2021 · Updated 5 years ago
- Implementation of finite difference frequency domain equations for Maxwell's equations and the exploration of domain decomposition, speci… ☆13 · Oct 21, 2018 · Updated 7 years ago
- Best CPU/GPU sparse solver for large sparse matrices ☆21 · Oct 5, 2021 · Updated 4 years ago
- A library to define abstract linear operators, and associated algebra and matrix-free algorithms, that works with PyTorch Tensors. ☆16 · Dec 7, 2025 · Updated 3 months ago
- Implementation of Nesterov and Polyak's (2006) cubic regularization algorithm and Cartis et al's (2011) adaptive cubic regularization alg… ☆18 · Feb 23, 2022 · Updated 4 years ago
- Presentation materials from the July 2018 Golang Korea meetup ☆14 · Jul 26, 2018 · Updated 7 years ago
- ☆14 · Mar 29, 2022 · Updated 3 years ago
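- Laboratory reservation system based on Java + Spring Boot + Vue (source code + database). The project has a separate front end and back end, and the system has three roles: administrator, teacher, and student. ### 1. Student: 1. Log in, register 2. Laboratory list 3. Laboratory reservation 4. View reservation progress and cancel 5. View announcements 6. Subscribe to courses 7. Report laboratory repairs 8.… ☆15 · Dec 14, 2023 · Updated 2 years ago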
- Certifiably globally optimal unit quaternion rotation averaging via Sparse Bounded-degree sum of squares optimization. ☆17 · Apr 4, 2019 · Updated 6 years ago
- Set a ROS navigation goal using latitude and longitude. ☆10 · Nov 22, 2020 · Updated 5 years ago
- ☆18 · Jun 9, 2021 · Updated 4 years ago
- ☆16 · Jun 13, 2022 · Updated 3 years ago
- Cloud Barista's Coffeehouse is an open space for open-minded people who want to share and discuss technical knowledge for a great and hap… ☆14 · Aug 8, 2025 · Updated 7 months ago
- This example starts with a simple sum reduction in CUDA, then steps through a series of optimizations we can perform to improve its perfo… ☆14 · Jun 8, 2020 · Updated 5 years ago