tiny-tpu-v2 / tiny-tpu
A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1
☆1,042 · Updated 3 months ago
Alternatives and similar repositories for tiny-tpu
Users who are interested in tiny-tpu are comparing it to the repositories listed below.
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1. ☆189 · Updated last year
- A machine learning accelerator core designed for energy-efficient AI at the edge. ☆1,908 · Updated this week
- An open-source reimplementation of Google's Tensor Processing Unit (TPU). ☆712 · Updated 8 years ago
- ☆303 · Updated this week
- Run 64-bit Linux on LiteX + RocketChip ☆207 · Updated last month
- Nvidia Instruction Set Specification Generator ☆301 · Updated last year
- Fast and Furious AMD Kernels ☆309 · Updated last week
- Tenstorrent TT-BUDA Repository ☆313 · Updated 8 months ago
- Machine-Learning Accelerator System Exploration Tools ☆183 · Updated last month
- Tilus is a tile-level kernel programming language with explicit control over shared memory and registers. ☆408 · Updated this week
- TT-NN operator library, and TT-Metalium low level kernel programming model. ☆1,271 · Updated this week
- Visualization of cache-optimized matrix multiplication ☆156 · Updated 8 months ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆164 · Updated last week
- Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O ☆537 · Updated 2 months ago
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,134 · Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆171 · Updated last year
- This project aims to enable language model inference on FPGAs, supporting AI applications in edge devices and environments with limited r… ☆169 · Updated last year
- Algebraic enhancements for GEMM & AI accelerators ☆282 · Updated 9 months ago
- GPU documentation for humans ☆414 · Updated 2 weeks ago
- ☆1,794 · Updated 2 weeks ago
- kernels, of the mega variety ☆618 · Updated 2 months ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆141 · Updated this week
- Quantized LLM training in pure CUDA/C++. ☆220 · Updated this week
- Tenstorrent MLIR compiler ☆217 · Updated this week
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆368 · Updated 7 months ago
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel ☆1,973 · Updated last week
- ☆75 · Updated 3 weeks ago
- An MLIR-based toolchain for AMD AI Engine-enabled devices. ☆534 · Updated this week
- ☆113 · Updated last year
- Unofficial description of the CUDA assembly (SASS) instruction sets. ☆173 · Updated 4 months ago