☆16 Sep 24, 2024 Updated last year
Alternatives and similar repositories for py-codegen
Users that are interested in py-codegen are comparing it to the libraries listed below.
- ☆20 May 24, 2025 Updated 10 months ago
- Example of binding a TF32 CUTLASS GEMM kernel to PyTorch ☆12 Jun 7, 2024 Updated last year
- Programming Gemm Kernels on NVIDIA GPUs with Tensor Cores in Julia ☆43 Mar 30, 2026 Updated last week
- ☆57 Feb 24, 2026 Updated last month
- modified cutlass ☆15 Oct 26, 2020 Updated 5 years ago
- Inference Llama 2 with a model compiled to native code by TorchInductor ☆14 Feb 8, 2024 Updated 2 years ago
- A fast implementation of log() and exp() ☆57 Dec 14, 2022 Updated 3 years ago
- extensible collectives library in triton ☆98 Mar 31, 2025 Updated last year
- This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts". ☆123 Sep 24, 2025 Updated 6 months ago
- ☆13 Jun 20, 2019 Updated 6 years ago
- study of cutlass ☆22 Nov 10, 2024 Updated last year
- SParse AcceleRation on Tensor Architecture ☆18 Apr 7, 2025 Updated last year
- Triton-based Symmetric Memory operators and examples ☆94 Mar 28, 2026 Updated last week
- The simplest but fast implementation of matrix multiplication in CUDA. ☆40 Jul 26, 2024 Updated last year
- ☆14 Feb 5, 2018 Updated 8 years ago
- Unofficial mirror of pdftk - imported using git-ubuntu ☆10 Aug 20, 2018 Updated 7 years ago
- ☆12 May 23, 2018 Updated 7 years ago
- cuASR: CUDA Algebra for Semirings ☆45 Aug 22, 2022 Updated 3 years ago
- Benchmark framework of compute-in-memory based accelerators for deep neural network (inference engine focused) ☆22 Jun 1, 2021 Updated 4 years ago
- GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs ☆16 Apr 18, 2025 Updated 11 months ago
- Bazel rules for interacting with bazel build artifacts and bringing them into your workspace ☆10 Jul 24, 2024 Updated last year
- C-compatible enum for Julia ☆15 Dec 23, 2023 Updated 2 years ago
- Fork of Enzyme to work on Reverse-Mode Differentiation at the MLIR-level. ☆11 Apr 23, 2023 Updated 2 years ago
- Automatically exported from code.google.com/p/a-roofline-model-of-energy-ubenchmarks ☆12 Jul 14, 2020 Updated 5 years ago
- ☆19 Oct 3, 2022 Updated 3 years ago
- Code and experiments for the NeurIPS 2023 paper Stabilized Neural Differential Equations for Learning Dynamics with Explicit Constraints ☆12 Mar 26, 2024 Updated 2 years ago
- TLB Benchmarks ☆35 Sep 11, 2017 Updated 8 years ago
- My mum's recipes. In Polish. ☆12 Apr 23, 2020 Updated 5 years ago
- An extension library of WMMA API (Tensor Core API) ☆111 Jul 12, 2024 Updated last year
- SIMDized check which bytes are in a set ☆28 Oct 21, 2018 Updated 7 years ago
- jwilder/nginx-proxy and nginx-proxy/docker-letsencrypt-nginx-proxy-companion launched by docker-compose. ☆10 Aug 31, 2020 Updated 5 years ago
- A library for accelerating data compression using Intel® QAT. ☆21 Feb 26, 2026 Updated last month
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆338 Apr 3, 2026 Updated last week
- The simplest, cheapest lambda-hosted URL shortener ☆27 Feb 5, 2026 Updated 2 months ago
- Region-level profiling for CUDA kernels with trace, NVBit, CUPTI, and an interactive Explorer. ☆103 Mar 27, 2026 Updated 2 weeks ago
- The Core Registry of Container Blueprints for the Autamus Build System ☆15 Mar 14, 2023 Updated 3 years ago
- dotfiles ☆19 Dec 27, 2024 Updated last year
- Collection of tools/data used for reverse engineering Nintendo Switch sysmodules with Ghidra ☆18 Oct 14, 2024 Updated last year
- Attention in SRAM on Tenstorrent Grayskull ☆38 Jul 18, 2024 Updated last year