SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs
☆18May 23, 2024Updated last year
Alternatives and similar repositories for SDA_code
Users who are interested in SDA_code are comparing it to the libraries listed below
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design☆130Jun 27, 2023Updated 2 years ago
- ☆13Apr 15, 2025Updated 11 months ago
- An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences☆31Mar 7, 2024Updated 2 years ago
- ☆30Apr 26, 2019Updated 6 years ago
- Get started with Ubuntu quickly from scratch and set up a deep learning environment; continuously updated☆10Apr 18, 2023Updated 2 years ago
- ☆14Aug 1, 2024Updated last year
- ☆14May 23, 2024Updated last year
- [TRETS 2025][FPGA 2024] FPGA Accelerator for Imbalanced SpMV using HLS☆20Aug 24, 2025Updated 6 months ago
- ☆53Aug 28, 2024Updated last year
- ☆16Jul 1, 2024Updated last year
- [FPGA'21] Microbenchmarks for Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers☆31Dec 16, 2021Updated 4 years ago
- ☆17Nov 20, 2022Updated 3 years ago
- ☆16Feb 3, 2022Updated 4 years ago
- Simulator for BitFusion☆101Aug 6, 2020Updated 5 years ago
- Scraping repository of the most relevant topics with regards to Spatio-Temporal Neural Networks available in the arXiv archive. The repos…☆15Updated this week
- LLM-Aided FPGA Design and Debug Flow☆24Aug 1, 2025Updated 7 months ago
- ☆49Apr 22, 2021Updated 4 years ago
- Unofficial PyTorch implementation of the paper "Conditional Channel Gated Networks for Task-Aware Continual Learning"☆20Jan 22, 2021Updated 5 years ago
- SIMD-based CNN accelerator using Vitis HLS☆11Jul 15, 2022Updated 3 years ago
- A Fast DNN Accelerator Design Space Exploration Framework.☆46Aug 10, 2022Updated 3 years ago
- HLS implemented systolic array structure☆41Nov 13, 2017Updated 8 years ago
- Kratos: An FPGA Benchmark for Unrolled Deep Neural Networks with Fine-Grained Sparsity and Mixed Precision☆12Jan 19, 2026Updated 2 months ago
- Song collection for Ultrastar Deluxe☆24Apr 16, 2021Updated 4 years ago
- (Verilog) A simple convolution layer implementation with systolic array structure☆13May 9, 2022Updated 3 years ago
- The official implementation of the DAC 2024 paper GQA-LUT☆21Dec 20, 2024Updated last year
- Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs☆54Mar 13, 2026Updated last week
- Allo Accelerator Design and Programming Framework (PLDI'24)☆361Mar 13, 2026Updated last week
- Instance segmentation of center pivot irrigation system in Brazil using Landsat images and Convolutional Neural Network☆11May 27, 2024Updated last year
- High-level synthesis (HLS) implementation of Sparse Matrix Vector Multiplication☆19Feb 17, 2022Updated 4 years ago
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code☆15Mar 19, 2023Updated 3 years ago
- The official implementation of the NeurIPS 2022 paper Q-ViT.☆105May 22, 2023Updated 2 years ago
- Implementation of convolution layer in different flavors☆68Oct 8, 2017Updated 8 years ago
- This is a 4*5 PE array for LeNet accelerator based on FPGA.☆13Jul 20, 2022Updated 3 years ago
- HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators☆186Jan 23, 2026Updated last month
- Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.☆13Apr 6, 2021Updated 4 years ago
- This repository contains papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs).☆11May 24, 2024Updated last year
- ☆21Sep 29, 2025Updated 5 months ago
- Training wide residual networks for deployment using a single bit for each weight - Official Code Repository for ICLR 2018 Published Pape…☆36May 27, 2020Updated 5 years ago
- Implementation of the Winograd algorithm.☆24Nov 6, 2018Updated 7 years ago