arakhmati / torchtrail
torchtrail: trace the graph of torch functions and modules for visualization, reports, etc
☆25 Updated last month
Alternatives and similar repositories for torchtrail
Users who are interested in torchtrail are comparing it to the libraries listed below.
- High-Performance SGEMM on CUDA devices ☆96 Updated 5 months ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆72 Updated this week
- Tenstorrent MLIR compiler ☆141 Updated this week
- seqax = sequence modeling + JAX ☆162 Updated 2 weeks ago
- Explore training for quantized models ☆18 Updated last week
- ⭐️ TTNN Compiler for PyTorch 2 ⭐️ Enables running PyTorch models on Tenstorrent hardware using eager or compile path ☆47 Updated this week
- extensible collectives library in triton ☆86 Updated 2 months ago
- TT-NN operator library, and TT-Metalium low level kernel programming model. ☆942 Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆276 Updated 3 weeks ago
- Make triton easier ☆46 Updated last year
- ☆52 Updated 10 months ago
- LLM training in simple, raw C/CUDA ☆99 Updated last year
- Attention in SRAM on Tenstorrent Grayskull ☆36 Updated 11 months ago
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆61 Updated 5 months ago
- JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training ☆51 Updated last month
- Repo for AI Compiler team. The intended purpose of this repo is for implementation of a PJRT device. ☆18 Updated this week
- A place to store reusable transformer components of my own creation or found on the interwebs ☆56 Updated last week
- Tenstorrent TT-BUDA Repository ☆313 Updated 2 months ago
- ☆12 Updated 3 weeks ago
- Experiment of using Tangent to autodiff triton ☆79 Updated last year
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆114 Updated this week
- MLIR-based partitioning system ☆97 Updated this week
- PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7) ☆66 Updated 3 months ago
- ☆109 Updated 3 months ago
- PyTorch centric eager mode debugger ☆47 Updated 6 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆45 Updated 11 months ago
- IREE's PyTorch Frontend, based on Torch Dynamo. ☆87 Updated this week
- ☆221 Updated this week
- ☆318 Updated last week
- Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆106 Updated 5 months ago