arakhmati / torchtrail
torchtrail: trace the graph of torch functions and modules for visualization, reports, etc.
☆25Updated 9 months ago
Alternatives and similar repositories for torchtrail:
Users who are interested in torchtrail are comparing it to the libraries listed below:
- TT-NN operator library, and TT-Metalium low level kernel programming model.☆662Updated this week
- LLM training in simple, raw C/CUDA☆92Updated 10 months ago
- extensible collectives library in triton☆83Updated 5 months ago
- High-Performance SGEMM on CUDA devices☆86Updated last month
- ☆188Updated 3 weeks ago
- ⭐️ TTNN Compiler for PyTorch 2.0 ⭐️ It enables running PyTorch2.0 models on Tenstorrent hardware☆30Updated this week
- ☆187Updated 8 months ago
- ☆15Updated 5 months ago
- Make triton easier☆47Updated 9 months ago
- Boosting 4-bit inference kernels with 2:4 Sparsity☆67Updated 6 months ago
- High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.☆100Updated 8 months ago
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks.☆58Updated last month
- Unit Scaling demo and experimentation code☆16Updated last year
- Fast low-bit matmul kernels in Triton☆257Updated last week
- ☆21Updated last week
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆40Updated last year
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆39Updated 10 months ago
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI.☆125Updated last year
- A comprehensive tool for visualizing and analyzing model execution, offering interactive graphs, memory plots, tensor details, buffer ove…☆26Updated this week
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs☆231Updated 2 weeks ago
- OpenAI Triton backend for Intel® GPUs☆168Updated this week
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆43Updated 7 months ago
- PyTorch emulation library for Microscaling (MX)-compatible data formats☆207Updated 5 months ago
- Attention in SRAM on Tenstorrent Grayskull☆32Updated 7 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk☆81Updated this week
- ☆73Updated 4 months ago