tenstorrent / tt-metal
TT-NN operator library and TT-Metalium low-level kernel programming model.
☆475 Updated this week
Related projects
Alternatives and complementary repositories for tt-metal
- Tenstorrent TT-BUDA Repository ☆225 Updated last month
- Tenstorrent MLIR compiler ☆75 Updated this week
- ⭐️ TTNN Compiler for PyTorch 2.0 ⭐️ It enables running PyTorch 2.0 models on Tenstorrent hardware ☆25 Updated this week
- Backward compatible ML compute opset inspired by HLO/MHLO ☆412 Updated this week
- Repository of model demos using TT-Buda ☆55 Updated 2 weeks ago
- An MLIR-based toolchain for AMD AI Engine-enabled devices. ☆307 Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆313 Updated this week
- Buda Compiler Backend for Tenstorrent devices ☆26 Updated last month
- Shared Middle-Layer for Triton Compilation ☆191 Updated this week
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆95 Updated last week
- This is the top-level repository for the Accel-Sim framework. ☆305 Updated 3 weeks ago
- HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing ☆326 Updated 7 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆147 Updated last month
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆124 Updated this week
- Development repository for the Triton language and compiler ☆93 Updated this week
- IREE's PyTorch Frontend, based on Torch Dynamo. ☆55 Updated this week
- TVM for Tenstorrent ASICs ☆20 Updated this week
- Berkeley's Spatial Array Generator ☆815 Updated this week
- Open, Modular, Deep Learning Accelerator ☆254 Updated 7 months ago
- Awesome resources for GPUs ☆492 Updated last year
- Stores documents and resources used by the OpenXLA developer community ☆107 Updated 3 months ago
- Nvidia Instruction Set Specification Generator ☆216 Updated 4 months ago
- Fast CUDA matrix multiplication from scratch ☆479 Updated 10 months ago
- Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure ☆767 Updated this week
- PyTorch emulation library for Microscaling (MX)-compatible data formats ☆163 Updated last month
- A scalable High-Level Synthesis framework on MLIR ☆228 Updated 6 months ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆405 Updated last year
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆420 Updated this week
- IREE plugin repository for the AMD AIE accelerator ☆69 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆143 Updated this week