Simple example of how to write an Implicit GEMM Convolution in CUDA using the tensor core WMMA API and bindings for PyTorch.
☆18 · Jun 29, 2023 · Updated 2 years ago
Alternatives and similar repositories for implicit-gemm-tensor-core-convolution
Users who are interested in implicit-gemm-tensor-core-convolution are comparing it to the libraries listed below.
- A tiny distro-independent package manager written in Rust. ☆16 · Jun 22, 2024 · Updated last year
- ☆14 · May 28, 2019 · Updated 6 years ago
- LLVM alternative in Rust ☆15 · May 20, 2024 · Updated last year
- A toy C compiler implemented in Rust. ☆19 · Feb 4, 2023 · Updated 3 years ago
- Zig uuidv4 implementation without allocations ☆11 · Dec 4, 2024 · Updated last year
- Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core. ☆30 · Feb 12, 2022 · Updated 4 years ago
- A Wasm-focused mruby VM implementation, built upon pure Rust ☆38 · Updated this week
- ☆32 · Aug 24, 2022 · Updated 3 years ago
- CUDA GPU implementation of GMRES iterative Solver ☆10 · Apr 16, 2012 · Updated 13 years ago
- The simulator for the Next-Generation Championship in Branch Prediction (CBP-NG) ☆26 · Updated this week
- ☆11 · Sep 4, 2022 · Updated 3 years ago
- Artifact for 'Register Optimizations for Stencils on GPUs' ☆10 · Sep 18, 2018 · Updated 7 years ago
- This repository implements a scaled-down LLaMA 2-like model on an ARM Cortex-M3 soft core, with a custom systolic array RTL module for ef… ☆11 · Jun 25, 2025 · Updated 8 months ago
- 🦌 Deep Retention, Winner @ Calhacks ✨🌠 ☆10 · Oct 26, 2024 · Updated last year
- NVIDIA Compute Unified Device Architecture Toolkit ☆15 · Feb 2, 2026 · Updated 3 weeks ago
- Ruby script to convert Concepts SVG files into multiple PDF pages ☆11 · Jun 1, 2022 · Updated 3 years ago
- Custom error type of nom to improve accuracy of error position ☆11 · Mar 23, 2023 · Updated 2 years ago
- ☆12 · Jan 19, 2020 · Updated 6 years ago
- Typed Python equivalent for R pipes. ☆13 · Oct 16, 2022 · Updated 3 years ago
- OpenFOAM right wmake at the right time ☆11 · Mar 10, 2019 · Updated 6 years ago
- A creative coding environment where Claude can express itself through generative art using p5.js. See tweet thread for examples: https://… ☆13 · Feb 3, 2026 · Updated 3 weeks ago
- The peachili (Peach + Chili) programming language ☆11 · Feb 2, 2021 · Updated 5 years ago
- GPGPU version of 数え上げお姉さん (https://github.com/primenumber/kazoeage-oneesan) ☆11 · Dec 3, 2021 · Updated 4 years ago
- Building or integrating an LLM wrapper shouldn't take more than 10 minutes. ☆13 · Feb 1, 2025 · Updated last year
- Lambda Calculus parser and interpreter made in TypeScript's type system ☆14 · May 24, 2024 · Updated last year
- A simple currency converter based on https://fixer.io. ☆11 · May 25, 2018 · Updated 7 years ago
- ☆10 · Nov 16, 2024 · Updated last year
- From-scratch kernel built to serve web pages ☆28 · Sep 27, 2025 · Updated 5 months ago
- ☆12 · Jan 13, 2023 · Updated 3 years ago
- NeonGoby alias analysis checker ☆14 · Jul 2, 2013 · Updated 12 years ago
- A compiler for a gentle programming language that you can write even when exhausted ☆11 · Dec 25, 2025 · Updated 2 months ago
- A simple Python library for compartment models ☆11 · Aug 23, 2021 · Updated 4 years ago
- A simple library-less CUDA implementation of the OneSweep sorting algorithm. ☆11 · Feb 26, 2024 · Updated 2 years ago
- Probabilistic numerical finite differences. Compute finite difference weights and differentiation matrices on scattered data sites and wi… ☆12 · May 8, 2023 · Updated 2 years ago
- Simple HTTP request library for Zig applications ☆13 · Dec 30, 2024 · Updated last year
- ☆10 · Apr 25, 2025 · Updated 10 months ago
- ☆13 · Jan 18, 2020 · Updated 6 years ago
- Benchmark of glucose predictive models in diabetes ☆11 · Nov 12, 2024 · Updated last year
- Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y… ☆46 · May 22, 2024 · Updated last year