tinygrad / gpuctypes
ctypes wrappers for HIP, CUDA, and OpenCL
☆129 Updated 10 months ago
Alternatives and similar repositories for gpuctypes:
Users who are interested in gpuctypes are comparing it to the libraries listed below.
- Learning about CUDA by writing PTX code. ☆129 Updated last year
- Nvidia Instruction Set Specification Generator ☆260 Updated 10 months ago
- High-Performance SGEMM on CUDA devices ☆90 Updated 3 months ago
- RDNA3 emulator ☆54 Updated 3 weeks ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆47 Updated this week
- Custom PTX Instruction Benchmark ☆123 Updated 2 months ago
- Tensor library with autograd using only Rust's standard library ☆67 Updated 10 months ago
- Tutorials on tinygrad ☆374 Updated last month
- ☆83 Updated this week
- An implementation of delta-iris in tinygrad ☆72 Updated 8 months ago
- Tenstorrent MLIR compiler ☆122 Updated this week
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆180 Updated last year
- tenstorrent kernel from twitch ☆27 Updated last year
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆131 Updated last year
- GPUOcelot: A dynamic compilation framework for PTX ☆187 Updated 3 months ago
- Attention in SRAM on Tenstorrent Grayskull ☆35 Updated 9 months ago
- Solve puzzles to improve your tinygrad skills! ☆123 Updated last month
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆261 Updated this week
- pytorch from scratch in pure C/CUDA and python ☆40 Updated 7 months ago
- LLM training in simple, raw C/Metal Shading Language ☆54 Updated last year
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆62 Updated 10 months ago
- LLM training in simple, raw C/CUDA ☆94 Updated last year
- Reference Kernels for the Leaderboard ☆43 Updated this week
- parallelized hyperdimensional tictactoe ☆117 Updated 8 months ago
- ☆33 Updated this week
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆349 Updated 2 weeks ago
- ☆31 Updated 4 months ago
- ☆444 Updated last month
- Enabling tinygrad compatibility with the Google Edge TPU ☆77 Updated 8 months ago
- ⭐️ TTNN Compiler for PyTorch 2 ⭐️ It enables running PyTorch models on Tenstorrent hardware using torch.compile path ☆36 Updated this week