geohot / tt-tiny
tiny code to access tenstorrent blackhole
☆55 Updated last month
Alternatives and similar repositories for tt-tiny
Users that are interested in tt-tiny are comparing it to the libraries listed below
- ☆35 Updated this week
- Tensor library with autograd using only Rust's standard library ☆68 Updated last year
- ☆47 Updated last week
- RDNA3 emulator ☆54 Updated 2 months ago
- Because it's there. ☆16 Updated 9 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆80 Updated this week
- An implementation of delta-iris in tinygrad ☆72 Updated 10 months ago
- The Finite Field Assembly Programming Language ☆36 Updated last month
- High-Performance SGEMM on CUDA devices ☆97 Updated 5 months ago
- Custom PTX Instruction Benchmark ☆126 Updated 4 months ago
- asynchronous/distributed speculative evaluation for llama3 ☆39 Updated 11 months ago
- Learning about CUDA by writing PTX code. ☆133 Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆160 Updated this week
- It's a baby compiler. (Lean btw.) ☆16 Updated last month
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆96 Updated last month
- SIMD quantization kernels ☆73 Updated this week
- ☆22 Updated last month
- This repository contains a simple llama3 implementation in pure JAX. ☆67 Updated 5 months ago
- ☆248 Updated last year
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆66 Updated 3 months ago
- Rust Implementation of micrograd ☆52 Updated last year
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆211 Updated last year
- Tensor library & inference framework for machine learning ☆101 Updated last week
- ☆13 Updated last month
- Standalone commandline CLI tool for compiling Triton kernels ☆17 Updated 10 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆58 Updated last year
- Samples of good AI generated CUDA kernels ☆84 Updated last month
- PyTorch script hot swap: Change code without unloading your LLM from VRAM ☆126 Updated 2 months ago
- Simple high-throughput inference library ☆120 Updated 2 months ago