zohourih / Diffusion_FPGA
Highly optimized, spatially and temporally blocked implementation of Diffusion 2D and 3D stencils for Intel FPGAs using OpenCL
☆12 Updated last year
Alternatives and similar repositories for Diffusion_FPGA:
Users who are interested in Diffusion_FPGA are comparing it to the repositories listed below
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆19 Updated last year
- Heterogeneous Accelerated Computed Cluster (HACC) Resources Page ☆21 Updated last week
- XRM (Xilinx FPGA Resource Manager) Document: ☆25 Updated last year
- Fork of upstream onnxruntime focused on supporting risc-v accelerators ☆84 Updated 2 years ago
- A Toy-Purpose TPU Simulator ☆18 Updated 10 months ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆18 Updated 2 years ago
- HW/SW co-design of sentence-level energy optimizations for latency-aware multi-task NLP inference ☆46 Updated last year
- The Riallto Open Source Project from AMD ☆77 Updated 2 weeks ago
- ☆12 Updated 3 years ago
- Provides the code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators" by Luk… ☆17 Updated 5 years ago
- Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators ☆89 Updated last month
- TensorCore Vector Processor for Deep Learning - Google Summer of Code Project ☆21 Updated 3 years ago
- FPGA version of Rodinia in HLS C/C++ ☆35 Updated 4 years ago
- Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware ☆16 Updated 3 years ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of … ☆26 Updated 4 months ago
- For CPU experiment ☆9 Updated 4 years ago
- FlexGripPlus: an open-source GPU model for reliability evaluation and micro architectural simulation ☆99 Updated last year
- A polyhedral compiler for hardware accelerators ☆56 Updated 9 months ago
- ☆14 Updated 3 years ago
- Systolic Three Matrix Multiplier for Graph Convolutional Networks using High Level Synthesis ☆22 Updated 2 years ago
- Example for running IREE in a bare-metal Arm environment. ☆33 Updated last month
- Accelerator simulation framework using nn_dataflow traces and energy, etc. post-processing ☆7 Updated 6 years ago
- SMASH is a hardware-software cooperative mechanism that enables highly-efficient indexing and storage of sparse matrices. The key idea of… ☆16 Updated 4 years ago
- Benchmarks / test cases accompanying the PLCT Lab rvv-llvm implementation ☆22 Updated 4 years ago
- FPGA acceleration of arbitrary precision floating point computations. ☆38 Updated 2 years ago
- GPTPU for SC 2021 ☆51 Updated 2 years ago
- Cycle-accurate Network-on-Chip Simulator ☆27 Updated 2 years ago
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee) ☆22 Updated this week
- Multi-target compiler for Sum-Product Networks, based on MLIR and LLVM. ☆23 Updated 4 months ago
- Learn NVDLA by SOMNIA ☆33 Updated 5 years ago