tenstorrent / tt-studio
TT-Studio: An all-in-one platform to deploy and manage AI models optimized for Tenstorrent hardware with dedicated front-end demo applications.
☆14 Updated this week
Alternatives and similar repositories for tt-studio:
Users who are interested in tt-studio are comparing it to the libraries listed below:
- Tenstorrent console based hardware information program ☆35 Updated last week
- A comprehensive tool for visualizing and analyzing model execution, offering interactive graphs, memory plots, tensor details, buffer ove… ☆27 Updated this week
- Repository of model demos using TT-Buda ☆63 Updated 2 weeks ago
- Tenstorrent MLIR compiler ☆109 Updated this week
- The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their per… ☆30 Updated this week
- High-Performance SGEMM on CUDA devices ☆88 Updated 2 months ago
- Custom PTX Instruction Benchmark ☆120 Updated last month
- Attention in SRAM on Tenstorrent Grayskull ☆32 Updated 8 months ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆12 Updated this week
- An experimental CPU backend for Triton ☆103 Updated this week
- Tenstorrent TT-BUDA Repository ☆307 Updated 2 weeks ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆40 Updated 2 weeks ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆35 Updated this week
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆26 Updated last week
- RDNA3 emulator ☆54 Updated this week
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆108 Updated 3 weeks ago
- IREE's PyTorch Frontend, based on Torch Dynamo. ☆74 Updated this week
- AI Assistant running within your browser. ☆62 Updated 4 months ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆99 Updated last month
- Nvidia Instruction Set Specification Generator ☆253 Updated 8 months ago
- Tenstorrent Kernel Module ☆39 Updated this week
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆24 Updated 2 weeks ago
- Extensible collectives library in Triton ☆84 Updated this week
- ☆17 Updated 2 weeks ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆55 Updated last week
- Fast low-bit matmul kernels in Triton ☆275 Updated this week
- Training Models Daily ☆17 Updated last year
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆105 Updated 5 months ago
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ☆236 Updated last month
- ☆157 Updated this week