Winograd-based convolution implementation in OpenCL
☆28Jan 22, 2017Updated 9 years ago
Alternatives and similar repositories for Winograd-OpenCL
Users that are interested in Winograd-OpenCL are comparing it to the libraries listed below
- GPU implementation of Winograd convolution☆10Oct 23, 2017Updated 8 years ago
- Fast CUDA Kernels for ResNet Inference.☆182May 26, 2019Updated 6 years ago
- ☆26Dec 1, 2016Updated 9 years ago
- ☆13Mar 29, 2025Updated 11 months ago
- Skeletonide is a parallel implementation of Zhang-Suen morphological thinning algorithm written in Halide-lang. Use it for fast skeletoni…☆14Oct 21, 2020Updated 5 years ago
- Useful statistics about your SPIR-V shader modules!☆15Jun 13, 2019Updated 6 years ago
- Simple example showing how to use DGMA in OpenCL☆13Feb 11, 2016Updated 10 years ago
- Instructions and templates for SC authors☆17Aug 22, 2021Updated 4 years ago
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Apr 9, 2019Updated 6 years ago
- ☆24Mar 22, 2018Updated 7 years ago
- OpenCL Labs for PAPAA Summer School 2016 Edition☆46Jul 24, 2017Updated 8 years ago
- A simple implementation of syntax highlighting in a VS Code extension.☆23Jan 15, 2017Updated 9 years ago
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20)☆27Oct 3, 2023Updated 2 years ago
- Implementation of FusedMM method for IPDPS 2021 paper titled "FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural N…☆31Aug 12, 2022Updated 3 years ago
- Library for fast image convolution in neural networks on Intel Architecture☆30Jun 25, 2017Updated 8 years ago
- Programming Assignment Project for the Information Visualization Course at the University of Chinese Academy of Sciences☆12Mar 10, 2017Updated 8 years ago
- NIST transition-edge sensor (TES) data acquisition framework☆16Feb 24, 2026Updated last week
- verilog CNN generator for FPGA☆34Jan 4, 2021Updated 5 years ago
- ☆21Nov 12, 2025Updated 3 months ago
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆28Jan 13, 2026Updated last month
- Point cloud display interface built with Qt, librviz, and ROS☆11Jan 5, 2022Updated 4 years ago
- A Tensorflow Implementation of VoxNet.☆11Aug 2, 2018Updated 7 years ago
- ☆12Sep 28, 2023Updated 2 years ago
- ☆11Dec 27, 2022Updated 3 years ago
- Community maintained hardware plugin for vLLM on AWS Neuron☆23Updated this week
- enuSpace plugin for Tensorflow (graphical logic block, flow programming)☆11Feb 6, 2020Updated 6 years ago
- Code and performance tests to demonstrate the COUNTLESS algorithm. https://medium.com/@willsilversmith/countless-high-performance-2x-down…☆10Oct 23, 2019Updated 6 years ago
- A simple script to plot the Roofline model for given HW platforms and applications☆10Aug 22, 2024Updated last year
- ☆12Apr 1, 2025Updated 11 months ago
- ☆10Nov 22, 2022Updated 3 years ago
- Implementation of a Systolic Array based sorting engine on an FPGA using Verilog☆11May 11, 2017Updated 8 years ago
- A non-iterative algorithm to reconstruct images from compressively sensed measurements.☆42Oct 24, 2020Updated 5 years ago
- ☆36Mar 6, 2019Updated 6 years ago
- ☆39Feb 4, 2024Updated 2 years ago
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch☆39Mar 27, 2025Updated 11 months ago
- ☆10Jan 6, 2021Updated 5 years ago
- ☆11Sep 14, 2020Updated 5 years ago
- [ACM MM23] Pytorch implementation for paper: SUG: Single-dataset Unified Generalization for 3D Point Cloud Classification☆12Jul 4, 2023Updated 2 years ago
- Strong (duck) typing for Ruby☆26Nov 20, 2014Updated 11 years ago