ROCm / tensorcast
☆11 Updated 3 weeks ago
Alternatives and similar repositories for tensorcast
Users who are interested in tensorcast compare it to the libraries listed below.
- ☆97 Updated last week
- Linux docker for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop ☆52 Updated last month
- A Winograd Minimal Filter Implementation in CUDA ☆24 Updated 3 years ago
- FRAME: Fast Roofline Analytical Modeling and Estimation ☆34 Updated last year
- ☆96 Updated last year
- ☆14 Updated 3 years ago
- ☆33 Updated 3 years ago
- IREE plugin repository for the AMD AIE accelerator ☆94 Updated this week
- PyTorch emulation library for Microscaling (MX)-compatible data formats ☆226 Updated last month
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee) ☆26 Updated this week
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆109 Updated 2 years ago
- A reference implementation of the Mind Mappings Framework. ☆29 Updated 3 years ago
- ☆70 Updated 5 years ago
- Artifact repository for paper Automatic Generation of High-Performance Quantized Machine Learning Kernels ☆17 Updated 4 years ago
- Official implementation of EMNLP'23 paper "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?" ☆22 Updated last year
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference ☆115 Updated 3 months ago
- Fork of upstream onnxruntime focused on supporting risc-v accelerators ☆86 Updated 2 years ago
- An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads. ☆51 Updated 2 months ago
- ☆34 Updated 4 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆137 Updated 2 years ago
- ☆41 Updated 10 months ago
- Heron: Automatically Constrained High-Performance Library Generation for Deep Learning Accelerators ☆20 Updated last year
- A scheduler for spatial DNN accelerators that generates high-performance schedules in one shot using mixed integer programming (MIP) ☆79 Updated last year
- An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM). ☆79 Updated 9 months ago
- ☆27 Updated 6 months ago
- STONNE: A Simulation Tool for Neural Networks Engines ☆131 Updated 11 months ago
- High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS ☆90 Updated 7 months ago
- MAERI: A DNN accelerator with reconfigurable interconnects to support flexible dataflow (http://synergy.ece.gatech.edu/tools/maeri/) ☆65 Updated 3 years ago
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20) ☆27 Updated last year
- agile hardware-software co-design ☆46 Updated 3 years ago