lixiuhong/implicit_gemm_convolution

☆14

Alternatives and similar repositories for implicit_gemm_convolution

Users that are interested in implicit_gemm_convolution are comparing it to the libraries listed below.

UDC-GAC / openCNN
A Winograd Minimal Filter Implementation in CUDA
☆31 · Aug 25, 2021 · Updated 4 years ago
piojanu / CUDA-im2col-conv
CUDA project for uni subject
☆26 · Oct 26, 2020 · Updated 5 years ago
njuhope / cuda_sgemm
☆121 · Apr 11, 2024 · Updated 2 years ago
abhinav-vaishya / Fast-Training-of-Convolutional-Networks-through-FFTs
Implementation of the paper - Fast Training of Convolutional Networks through FFTs (CUDA for parallelization)
☆10 · May 8, 2020 · Updated 6 years ago
ardenma / implicit-gemm-tensor-core-convolution
Simple example of how to write an Implicit GEMM Convolution in CUDA using the tensor core WMMA API and bindings for PyTorch.
☆19 · Jun 29, 2023 · Updated 3 years ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 · Oct 23, 2017 · Updated 8 years ago
lixiuhong / batched_gemm
☆40 · Feb 28, 2020 · Updated 6 years ago
zeroine / cutlass-cute-sample
☆49 · Apr 15, 2024 · Updated 2 years ago
LucasWilkinson / ASpT-mirror
Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactoring for understanding
☆17 · Oct 20, 2021 · Updated 4 years ago
ROCm / rocBLAS-Examples
Examples illustrating usage of the rocBLAS library
☆17 · Aug 12, 2024 · Updated last year
StephanPreibisch / FourierConvolutionCUDALib
Implementation of 3d non-separable convolution using CUDA & FFT Convolution
☆20 · Jan 15, 2019 · Updated 7 years ago
mabdullahsoyturk / HPC-Paper-Notes
My notes on various HPC papers.
☆27 · Jan 7, 2023 · Updated 3 years ago
uuudown / SBNN
Singular Binarized Neural Network based on GPU Bit Operations (see our SC-19 paper)
☆17 · Dec 9, 2020 · Updated 5 years ago
TobeyYang / Yahoo-News-Dataset
Yahoo! news dataset of DeepCom (EMNLP2019)
☆19 · Jan 21, 2021 · Updated 5 years ago
gthparch / NVPTX-SPIRV-Translator
☆28 · Oct 25, 2021 · Updated 4 years ago
tinyrolls / GLORY
☆18 · Jun 24, 2024 · Updated 2 years ago
OpenMPDK / SMDK-linux-cxl
Linux kernel source tree of developing SMDK kernel for CXL Memory
☆10 · Oct 26, 2023 · Updated 2 years ago
bollu / lz
A minimal MLIR dialect along the lines of STG to represent laziness.
☆17 · Jan 7, 2022 · Updated 4 years ago
GAMS-dev / gdx
Official low-level API to access GAMS Data eXchange (GDX) files with bindings to various programming languages
☆14 · Jul 14, 2026 · Updated last week
IRVING-L / HttpServer
☆15 · May 24, 2022 · Updated 4 years ago
DmitrySoshnikov / eva-tc-source
Repository for the "Building a Typechecker from scratch" class
☆15 · Oct 12, 2023 · Updated 2 years ago
citizen-erased / relief_mapping
Relief Mapping Demo
☆13 · Aug 18, 2011 · Updated 14 years ago
CSshengxy / MEC
ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)
☆17 · Apr 9, 2019 · Updated 7 years ago
uclasystem / Mako
Mako is a low-pause, high-throughput garbage collector designed for memory-disaggregated datacenters.
☆15 · Sep 2, 2024 · Updated last year
AyakaGEMM / Hands-on-MLIR
☆17 · May 14, 2024 · Updated 2 years ago
xuqiantong / CUDA-Winograd
Fast CUDA Kernels for ResNet Inference.
☆183 · May 26, 2019 · Updated 7 years ago
HydraQYH / expert_specialization_moe
Expert Specialization MoE Solution based on CUTLASS
☆27 · Apr 14, 2026 · Updated 3 months ago
ucb-bar / cva6-wrapper
Wrapper for ETH Ariane Core
☆21 · Sep 2, 2025 · Updated 10 months ago
randxie / mmdetection-tvm
mmdetection -> TVM
☆15 · Aug 22, 2020 · Updated 5 years ago
VegetableBird10086 / Course_CS231n
Includes assignment code and code analysis
☆10 · Aug 13, 2021 · Updated 4 years ago
justine18 / performance_experiment
Performance experiment - Pyomo vs JuMP
☆12 · Aug 3, 2023 · Updated 2 years ago
xinetzone / tvm-book
☆18 · Apr 24, 2026 · Updated 2 months ago
IcyCC / reimu
A Linux networking library in modern C++ based on EventLoop and multithreading
☆11 · Apr 5, 2020 · Updated 6 years ago
AnonymousYWL / MYLIB
☆18 · Apr 8, 2022 · Updated 4 years ago
Liu-xiandong / How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,329 · Jul 29, 2023 · Updated 2 years ago
weifengliu-ssslab / Benchmark_SpGEMM_using_CSR
CSR-based SpGEMM on nVidia and AMD GPUs
☆48 · Apr 9, 2016 · Updated 10 years ago
StrongSpoon / tvm.schedule
examples for tvm schedule API
☆101 · Jun 12, 2023 · Updated 3 years ago
HydraQYH / hp_rms_norm
High-performance RMSNorm implementation using SM core storage (registers and shared memory)
☆30 · Jan 22, 2026 · Updated 5 months ago
jeffhammond / PRK
This is a set of simple programs that can be used to explore the features of a parallel platform.
☆13 · Apr 20, 2026 · Updated 3 months ago