tenstorrent / tt-buda-demos
Repository of model demos using TT-Buda
☆62 Updated 4 months ago
Alternatives and similar repositories for tt-buda-demos
Users who are interested in tt-buda-demos compare it to the libraries listed below.
- ⭐️ TTNN Compiler for PyTorch 2 ⭐️ Enables running PyTorch models on Tenstorrent hardware using eager or compile path ☆53 Updated last week
- Tenstorrent TT-BUDA Repository ☆315 Updated 4 months ago
- Tenstorrent console based hardware information program ☆52 Updated last week
- TT-Studio : An all-in-one platform to deploy and manage AI models optimized for Tenstorrent hardware with dedicated front-end demo applic… ☆32 Updated this week
- Attention in SRAM on Tenstorrent Grayskull ☆38 Updated last year
- Buda Compiler Backend for Tenstorrent devices ☆30 Updated 4 months ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆102 Updated this week
- Tenstorrent Kernel Module ☆51 Updated this week
- Tenstorrent Firmware repository ☆19 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆111 Updated this week
- The Riallto Open Source Project from AMD ☆82 Updated 4 months ago
- The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their per… ☆49 Updated this week
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆44 Updated last week
- Repository for AI model benchmarking on TT-Buda ☆15 Updated 5 months ago
- Tenstorrent MLIR compiler ☆174 Updated this week
- Repo for AI Compiler team. The intended purpose of this repo is for implementation of a PJRT device. ☆21 Updated this week
- ☆58 Updated this week
- AMD related optimizations for transformer models ☆83 Updated last week
- High-Performance SGEMM on CUDA devices ☆97 Updated 7 months ago
- TVM for Tenstorrent ASICs ☆26 Updated this week
- Development repository for the Triton language and compiler ☆127 Updated this week
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆159 Updated last year
- Custom PTX Instruction Benchmark ☆126 Updated 6 months ago
- User-Mode Driver for Tenstorrent hardware ☆31 Updated last week
- No-code CLI designed for accelerating ONNX workflows ☆208 Updated 2 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆207 Updated 6 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆112 Updated 3 weeks ago
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆77 Updated 2 weeks ago
- AMD's graph optimization engine. ☆240 Updated this week
- RDNA3 emulator ☆54 Updated 4 months ago