mkauers / matrix-multiplication
Matrix multiplication schemes
☆198 Updated 4 months ago
Alternatives and similar repositories for matrix-multiplication
Users who are interested in matrix-multiplication are comparing it to the libraries listed below.
- Quantum computing without the linear algebra ☆76 Updated 3 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- The Finite Field Assembly Programming Language ☆36 Updated 4 months ago
- Meta-GPU lesson covering general aspects of GPU programming as well as specific frameworks ☆89 Updated 2 weeks ago
- A package for defining deep learning models using categorical algebraic expressions. ☆61 Updated last year
- Tensor library with autograd using only Rust's standard library ☆69 Updated last year
- Custom PTX Instruction Benchmark ☆127 Updated 6 months ago
- parallelized hyperdimensional tictactoe ☆125 Updated last year
- Visualization of cache-optimized matrix multiplication ☆155 Updated 6 months ago
- Exocompilation for productive programming of hardware accelerators ☆659 Updated this week
- The Cosmos numerical relativity code (with unstructured AMR) ☆20 Updated last year
- ☆138 Updated last year
- Learning about CUDA by writing PTX code. ☆135 Updated last year
- RDNA3 emulator ☆54 Updated 5 months ago
- Accelerated General (FP32) Matrix Multiplication from scratch in CUDA ☆139 Updated 8 months ago
- a categorical deep learning compiler ☆205 Updated this week
- Train neural networks that distill into logic circuits, using JAX ☆62 Updated 3 months ago
- Learn GPU Programming in Mojo🔥 by Solving Puzzles ☆136 Updated this week
- FP4 MAC Array ☆19 Updated last year
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆356 Updated 5 months ago
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1. ☆183 Updated last year
- tiny code to access tenstorrent blackhole ☆59 Updated 3 months ago
- ☆63 Updated last week
- LLM training in simple, raw C/CUDA ☆104 Updated last year
- A massively parallel, optimal functional runtime in Rust ☆31 Updated last year
- GPU-accelerated compiler ☆351 Updated last year
- ☆103 Updated 9 months ago
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆213 Updated last year
- Machine Learning with Symbolic Tensors ☆340 Updated 3 months ago
- ☆30 Updated last year