ROCm / tensorcast
☆11 Updated last month
Alternatives and similar repositories for tensorcast
Users who are interested in tensorcast are comparing it to the libraries listed below.
- ☆100 Updated this week
- PyTorch emulation library for Microscaling (MX)-compatible data formats ☆241 Updated last week
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows ☆59 Updated last year
- IREE plugin repository for the AMD AIE accelerator ☆97 Updated this week
- Linux docker for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop ☆52 Updated last month
- A Winograd Minimal Filter Implementation in CUDA ☆24 Updated 3 years ago
- ☆97 Updated last year
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity ☆108 Updated last week
- ☆30 Updated last week
- ☆30 Updated 2 years ago
- An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads. ☆53 Updated 2 months ago
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference ☆120 Updated 3 months ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆89 Updated 2 years ago
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores ☆51 Updated last year
- ☆98 Updated last year
- ☆14 Updated 3 years ago
- Official implementation of EMNLP'23 paper "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?" ☆22 Updated last year
- Singular Binarized Neural Network based on GPU Bit Operations (see our SC-19 paper) ☆15 Updated 4 years ago
- ☆18 Updated 3 years ago
- ☆34 Updated 4 years ago
- ☆33 Updated 3 years ago
- ☆149 Updated 2 years ago
- The Riallto Open Source Project from AMD ☆80 Updated last month
- agile hardware-software co-design ☆48 Updated 3 years ago
- Serpens is an HBM FPGA accelerator for SpMV ☆19 Updated 10 months ago
- A tool to deploy Deep Neural Networks on PULP-based SoC's ☆80 Updated 3 months ago
- ☆11 Updated last year
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆110 Updated 2 years ago
- HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs ☆34 Updated 5 months ago
- Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads. ☆54 Updated last month