eevaain / tiny-tpu
A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1.
☆117 Updated 3 months ago
Related projects
Alternatives and complementary repositories for tiny-tpu
- could we make an ml stack in 100,000 lines of code? ☆26 Updated 4 months ago
- Solve puzzles to improve your tinygrad skills! ☆87 Updated 2 months ago
- Because tinygrad got out of hand with line count ☆146 Updated last month
- Tensor library with autograd using only Rust's standard library ☆62 Updated 4 months ago
- parallelized hyperdimensional tictactoe ☆110 Updated 2 months ago
- pytorch from scratch in pure C/CUDA and python ☆37 Updated last month
- A really tiny autograd engine ☆87 Updated 7 months ago
- Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆167 Updated 3 months ago
- High Quality Resources on GPU Programming/Architecture ☆567 Updated 3 months ago
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch ☆179 Updated this week
- Run 64-bit Linux on LiteX + RocketChip ☆188 Updated 3 months ago
- Nvidia Instruction Set Specification Generator ☆215 Updated 4 months ago
- Tutorials on tinygrad ☆181 Updated last week
- An implementation of delta-iris in tinygrad ☆71 Updated 3 months ago
- a highly efficient compression algorithm for the n1 implant (neuralink's compression challenge) ☆45 Updated 5 months ago
- Simple Transformer in Jax ☆119 Updated 5 months ago
- ☆99 Updated 7 months ago
- i will automate factorio ☆89 Updated 3 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆126 Updated 4 months ago
- ☆47 Updated 3 months ago
- Simple byte pair encoding mechanism used for the tokenization process, written purely in C ☆120 Updated last week
- Alex Krizhevsky's original code from Google Code ☆190 Updated 8 years ago
- High-Performance FP32 Matrix Multiplication on CPU ☆301 Updated last week
- a tiny vectorstore implementation built with numpy. ☆56 Updated 6 months ago
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆157 Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆111 Updated 7 months ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆61 Updated 4 months ago
- Andrej Karpathy's micrograd implemented in C ☆29 Updated 3 months ago
- Rust Implementation of micrograd ☆51 Updated 4 months ago
- a tiny multidimensional array implementation in C similar to numpy, but only one file. ☆221 Updated 3 months ago