☆27Apr 7, 2025Updated 11 months ago
Alternatives and similar repositories for matmul-playground
Users who are interested in matmul-playground are comparing it to the libraries listed below
- TensaLang is a Tensor-first programming language, compiler, and runtime that lets you write the Model’s inference engine (e.g. LLMs) and s…☆71Feb 20, 2026Updated 2 weeks ago
- Docker cluster configuration for Hadoop☆10Jun 8, 2024Updated last year
- Implementation of Generic L-layer Neural Network from Scratch☆12May 14, 2018Updated 7 years ago
- Hex encode & decode a string, right from your terminal.☆10Jan 5, 2023Updated 3 years ago
- Studying open-source libraries☆10May 10, 2016Updated 9 years ago
- This repo contains the Assignments from Cornell Tech's ECE 5545 - Machine Learning Hardware and Systems offered in Spring 2023☆41May 31, 2023Updated 2 years ago
- An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE☆17Aug 5, 2022Updated 3 years ago
- For hosting ATS3 and developing CodeDepot☆18Feb 6, 2026Updated last month
- ☆16Sep 7, 2025Updated 6 months ago
- ☆10Jul 22, 2020Updated 5 years ago
- ☆10Sep 3, 2021Updated 4 years ago
- introduction to dataflow analysis using julia☆14Oct 26, 2020Updated 5 years ago
- ☆10Mar 3, 2024Updated 2 years ago
- High-Performance FP32 GEMM on CUDA devices☆117Jan 21, 2025Updated last year
- Standalone commandline CLI tool for compiling Triton kernels☆20Sep 13, 2024Updated last year
- [`CVPR 2024`] Official code repository for " 'Previously On ...' From Recaps to Story Summarization". https://arxiv.org/abs/2405.11487☆13Feb 21, 2025Updated last year
- ☆18Nov 11, 2025Updated 3 months ago
- 32-bit integer only RISC-V core, along with assembler, linker, and compiler from scratch☆23Sep 21, 2025Updated 5 months ago
- This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts".☆114Sep 24, 2025Updated 5 months ago
- ☆14Jan 31, 2020Updated 6 years ago
- 📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).☆69Apr 26, 2025Updated 10 months ago
- New York Times best sellers list with Google Books API☆15Dec 13, 2017Updated 8 years ago
- ☆38May 23, 2025Updated 9 months ago
- tutorials about polyhedral compilation.☆61Feb 9, 2026Updated last month
- Xilinx Modifications to Halide☆13May 3, 2021Updated 4 years ago
- Source code repository accompanying the scientific paper "Finding Efficient Spatial Distributions for Massively Instanced 3-d Models" (S.…☆16Apr 16, 2020Updated 5 years ago
- Experiments on GPT-3's ability to fit numerical models in-context.☆14Aug 11, 2022Updated 3 years ago
- A Deep RL Wordle Bot☆12Dec 6, 2022Updated 3 years ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- The Transformer in PyTorch☆13Aug 7, 2024Updated last year
- A collection of books used in day-to-day study, gathered and shared☆18Dec 4, 2019Updated 6 years ago
- libevent based multi-threaded web server☆19Apr 3, 2016Updated 9 years ago
- ☆18Feb 25, 2026Updated last week
- LLVM/MLIR based compiler instrumentation of AMD GPU kernels☆20Jul 13, 2025Updated 7 months ago
- Generate versal system design from ONNX model. AI engine kernels. Sub-microsecond speeds for autoencoders.☆16Dec 29, 2024Updated last year
- A fast full-system simulator of Tenstorrent hardware☆43Feb 27, 2026Updated last week
- An alternative Vivado custom design example (to fully Vitis) for the User Logic Partition targeting VCK5000☆13Jul 16, 2024Updated last year
- Wave: Python Domain-Specific Language for High Performance Machine Learning☆45Updated this week
- CPE change log and release notes☆26Sep 3, 2024Updated last year