tinygrad / gpuctypes
ctypes wrappers for HIP, CUDA, and OpenCL
☆128 · Updated 7 months ago
Alternatives and similar repositories for gpuctypes:
Users who are interested in gpuctypes are comparing it to the libraries listed below.
- Tutorials on tinygrad ☆342 · Updated this week
- Nvidia Instruction Set Specification Generator ☆243 · Updated 7 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆169 · Updated last week
- ☆72 · Updated this week
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆107 · Updated 3 weeks ago
- High-Performance SGEMM on CUDA devices ☆76 · Updated last month
- Sniff CUDA ioctls ☆189 · Updated last year
- RDNA3 emulator ☆51 · Updated 2 weeks ago
- ☆428 · Updated 2 months ago
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆169 · Updated last year
- Solve puzzles to improve your tinygrad skills! ☆111 · Updated 5 months ago
- An implementation of delta-iris in tinygrad ☆71 · Updated 6 months ago
- Tensor library with autograd using only Rust's standard library ☆65 · Updated 7 months ago
- Generate python ctypes classes from C headers. Requires LLVM clang ☆15 · Updated 6 months ago
- If tinygrad wasn't small enough for you... ☆686 · Updated 11 months ago
- Attention in SRAM on Tenstorrent Grayskull ☆31 · Updated 7 months ago
- throwaway GPT inference ☆140 · Updated 8 months ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆62 · Updated 7 months ago
- LLM training in simple, raw C/CUDA ☆91 · Updated 9 months ago
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆118 · Updated last year
- Scripts and environment for the tinybox ☆92 · Updated 9 months ago
- parallelized hyperdimensional tictactoe ☆112 · Updated 5 months ago
- pytorch from scratch in pure C/CUDA and python ☆40 · Updated 4 months ago
- ☆285 · Updated last week
- Enabling tinygrad compatibility with the Google Edge TPU ☆75 · Updated 5 months ago
- LLM training in simple, raw C/Metal Shading Language ☆47 · Updated 9 months ago
- Convert StableHLO models into Apple Core ML format ☆15 · Updated 3 weeks ago
- ☆86 · Updated 11 months ago
- High-Performance FP32 Matrix Multiplication on CPU ☆333 · Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆252 · Updated 2 weeks ago