Winograd-based convolution implementation in OpenCL
☆29Jan 22, 2017Updated 9 years ago
Alternatives and similar repositories for Winograd-OpenCL
Users that are interested in Winograd-OpenCL are comparing it to the libraries listed below.
- Understanding the principles of the Winograd algorithm☆10Apr 26, 2020Updated 6 years ago
- GPU implementation of Winograd convolution☆10Oct 23, 2017Updated 8 years ago
- ☆25Dec 1, 2016Updated 9 years ago
- Atamai Image Registration and Segmentation☆22Apr 1, 2026Updated 2 months ago
- Fast CUDA Kernels for ResNet Inference.☆183May 26, 2019Updated 7 years ago
- A Winograd based kernel for convolutions in deep learning framework☆15Jul 22, 2017Updated 8 years ago
- Instructions and templates for SC authors☆17Aug 22, 2021Updated 4 years ago
- ☆24Mar 22, 2018Updated 8 years ago
- ☆17Jul 1, 2020Updated 5 years ago
- ☆13Mar 29, 2025Updated last year
- Skeletonide is a parallel implementation of Zhang-Suen morphological thinning algorithm written in Halide-lang. Use it for fast skeletoni…☆14Oct 21, 2020Updated 5 years ago
- All in One - Continual Learning☆11May 24, 2023Updated 3 years ago
- Implementation of a Systolic Array based sorting engine on an FPGA using Verilog☆11May 11, 2017Updated 9 years ago
- Useful statistics about your SPIR-V shader modules!☆15Jun 13, 2019Updated 7 years ago
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20)☆27Oct 3, 2023Updated 2 years ago
- ☆11Sep 3, 2022Updated 3 years ago
- list of papers, code, datasets and other resources☆14Jul 22, 2022Updated 3 years ago
- verilog CNN generator for FPGA☆34Jan 4, 2021Updated 5 years ago
- ☆11Mar 15, 2023Updated 3 years ago
- Library for fast image convolution in neural networks on Intel Architecture☆30Jun 25, 2017Updated 8 years ago
- [AAAI2024] Summarizing Stream Data for Memory-Restricted Online Continual Learning☆21Apr 30, 2024Updated 2 years ago
- Samples from the AMD APP SDK (with OpenCRun support)☆16Jun 10, 2018Updated 8 years ago
- HLS Custom-Precision Floating-Point Library☆13Nov 6, 2017Updated 8 years ago
- Driver for the Explore-NFC card by NXP for the Raspberry Pi (http://www.nxp.com/demoboard/PNEV512R.html)☆20Sep 24, 2014Updated 11 years ago
- GEMM and Winograd based convolutions using CUTLASS☆28Jul 15, 2020Updated 5 years ago
- ROCm Command Line Profiler - Updated moved to https://github.com/GPUOpen-Tools/RCP☆10Aug 24, 2017Updated 8 years ago
- Caffe to VHDL☆68Jun 17, 2020Updated 5 years ago
- first-order deep learning accelerator model☆22Nov 27, 2017Updated 8 years ago
- ☆10Mar 24, 2020Updated 6 years ago
- ncnn android vkpeak☆25May 27, 2026Updated 2 weeks ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA☆35Jul 28, 2020Updated 5 years ago
- ☆13Nov 1, 2021Updated 4 years ago
- ☆12Sep 28, 2023Updated 2 years ago
- Programming Assignment Project for Information Visualization Course on University of Chinese Academy of Sciences☆12Mar 10, 2017Updated 9 years ago
- Accelerate convolution neural network for face recognition using GPU☆15Nov 24, 2020Updated 5 years ago
- A distributed implementation of alphafold3 based on xfold and tpp-pytorch-extension☆12Mar 26, 2026Updated 2 months ago
- Unofficial pytorch implementation of Piecewise Linear Unit dynamic activation function☆18Feb 8, 2023Updated 3 years ago
- ☆26Oct 1, 2025Updated 8 months ago
- nnvm&tvm example of cross compilation and deployment in Nvidia Jetson TX2 platform☆11Apr 17, 2018Updated 8 years ago