eevaain / tiny-tpu-old
A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1.
☆194 Updated last year
Alternatives and similar repositories for tiny-tpu-old
Users who are interested in tiny-tpu-old are comparing it to the libraries listed below.
- Nvidia Instruction Set Specification Generator ☆311 Updated last year
- could we make an ml stack in 100,000 lines of code? ☆46 Updated last year
- A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1 ☆1,161 Updated 5 months ago
- Solve puzzles to improve your tinygrad skills! ☆178 Updated 3 months ago
- ☆88 Updated last week
- parallelized hyperdimensional tictactoe ☆126 Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆175 Updated last year
- Verilog package manager written in Rust ☆144 Updated last year
- a mini 2x2 systolic array and PE demo ☆68 Updated last month
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆66 Updated 10 months ago
- Tensor library with autograd using only Rust's standard library ☆71 Updated last year
- Run 64-bit Linux on LiteX + RocketChip ☆209 Updated 3 months ago
- Visualization of cache-optimized matrix multiplication ☆157 Updated 10 months ago
- Tenstorrent TT-BUDA Repository ☆314 Updated 10 months ago
- Learning about CUDA by writing PTX code. ☆152 Updated last year
- Machine-Learning Accelerator System Exploration Tools ☆197 Updated 2 weeks ago
- Open source machine learning accelerators ☆397 Updated last year
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆202 Updated 2 years ago
- Tutorials on tinygrad ☆456 Updated 3 months ago
- ☆451 Updated 10 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- tiny code to access tenstorrent blackhole ☆61 Updated 8 months ago
- Build infrastructure for class-wide tapeout for 18-224/624 Intro to Open Source Chip Design, Spring 2023 ☆19 Updated 2 years ago
- ☆119 Updated 2 years ago
- ☆96 Updated last year
- It's a core. Made on Twitch. ☆266 Updated 4 years ago
- pytorch from scratch in pure C/CUDA and python ☆41 Updated last year
- Custom PTX Instruction Benchmark ☆138 Updated 11 months ago
- Tutorials about tinygrad, an end-to-end deep learning stack ☆89 Updated last week
- a highly efficient compression algorithm for the n1 implant (neuralink's compression challenge) ☆47 Updated last year