tenstorrent / tt-inference-server
☆44 · Updated last week
Alternatives and similar repositories for tt-inference-server
Users interested in tt-inference-server are comparing it to the libraries listed below.
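For context, tt-inference-server packages ready-to-run LLM inference for Tenstorrent hardware. Below is a minimal sketch of querying a running instance, assuming it exposes a vLLM-style OpenAI-compatible HTTP endpoint; the base URL, port, and model id are assumptions, so check the repository's docs for your actual deployment:

```python
# Minimal sketch (not taken from the repo): querying a running
# tt-inference-server instance over HTTP. The server wraps vLLM, which
# exposes an OpenAI-compatible API; the URL, port, and model id below
# are assumptions -- adjust them to match your deployment.
import requests

BASE_URL = "http://localhost:8000"  # assumed default port

payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",  # hypothetical model id
    "prompt": "In one sentence, what is a Tensix core?",
    "max_tokens": 64,
    "temperature": 0.7,
}

resp = requests.post(f"{BASE_URL}/v1/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```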
- The TT-Forge ONNX is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their p… ☆53 · Updated last week
- TT-Studio: An all-in-one platform to deploy and manage AI models optimized for Tenstorrent hardware with dedicated front-end demo applic… ☆39 · Updated last week
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆178 · Updated last week
- Tenstorrent MLIR compiler ☆243 · Updated this week
- Tenstorrent Kernel Module ☆57 · Updated last week
- Attention in SRAM on Tenstorrent Grayskull ☆40 · Updated last year
- Tenstorrent TT-BUDA Repository ☆314 · Updated 9 months ago
- ☆28 · Updated 10 months ago
- IREE plugin repository for the AMD AIE accelerator ☆119 · Updated this week
- TVM for Tenstorrent ASICs ☆28 · Updated 4 months ago
- AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming ☆164 · Updated last week
- GPUOcelot: A dynamic compilation framework for PTX ☆219 · Updated 11 months ago
- Tenstorrent console-based hardware information program ☆58 · Updated this week
- AI Tensor Engine for ROCm ☆344 · Updated last week
- ☆122 · Updated last week
- An experimental CPU backend for Triton ☆173 · Updated 2 months ago
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆36 · Updated 5 months ago
- Repository for AI model benchmarking on TT-Buda ☆15 · Updated 11 months ago
- IREE's PyTorch Frontend, based on Torch Dynamo. ☆105 · Updated last week
- MLIR-based partitioning system ☆162 · Updated this week
- ☆59 · Updated this week
- OpenAI Triton backend for Intel® GPUs ☆225 · Updated this week
- Development repository for the Triton language and compiler ☆140 · Updated this week
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices. ☆141 · Updated last year
- Repo for the AI Compiler team. The intended purpose of this repo is the implementation of a PJRT device. ☆51 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆27 · Updated last week
- ☆164 · Updated this week
- Unofficial description of the CUDA assembly (SASS) instruction sets. ☆198 · Updated 6 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆48 · Updated 5 months ago
- Fork of LLVM to support AMD AIEngine processors ☆187 · Updated this week