tenstorrent / tt-buda-demos
Repository of model demos using TT-Buda
☆63Updated 9 months ago
Alternatives and similar repositories for tt-buda-demos
Users who are interested in tt-buda-demos are comparing it to the libraries listed below.
- Tenstorrent TT-BUDA Repository☆314Updated 9 months ago
- [Deprecated] ⭐️ TT-NN Compiler for PyTorch 2 ⭐️ Enables running PyTorch models on Tenstorrent hardware using eager or compile path☆61Updated 2 weeks ago
- Tenstorrent console based hardware information program☆58Updated this week
- The Riallto Open Source Project from AMD☆83Updated 9 months ago
- Attention in SRAM on Tenstorrent Grayskull☆40Updated last year
- Buda Compiler Backend for Tenstorrent devices☆30Updated 9 months ago
- Tenstorrent Kernel Module☆57Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo☆113Updated this week
- Repository for AI model benchmarking on TT-Buda☆15Updated 10 months ago
- AMD related optimizations for transformer models☆97Updated 3 months ago
- TVM for Tenstorrent ASICs☆28Updated 4 months ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s…☆171Updated this week
- Tenstorrent Firmware repository☆23Updated 2 weeks ago
- ☆60Updated 2 years ago
- Development repository for the Triton language and compiler☆140Updated last week
- ☆91Updated last week
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆114Updated 5 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆48Updated 5 months ago
- High-Performance SGEMM on CUDA devices☆115Updated 11 months ago
- Tenstorrent Firmware Update Utility☆10Updated last week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆93Updated this week
- The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their per…☆53Updated this week
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools☆40Updated 5 months ago
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX☆170Updated last week
- GPUOcelot: A dynamic compilation framework for PTX☆219Updated 11 months ago
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit☆172Updated last year
- Ahead of Time (AOT) Triton Math Library☆87Updated this week
- Tenstorrent MLIR compiler☆233Updated this week
- IREE plugin repository for the AMD AIE accelerator☆117Updated last week
- ctypes wrappers for HIP, CUDA, and OpenCL☆130Updated last year