Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads.
☆64 · Jul 5, 2025 · Updated 8 months ago
Alternatives and similar repositories for stream
Users who are interested in stream are comparing it to the repositories listed below
- HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators ☆185 · Jan 23, 2026 · Updated last month
- A framework for fast exploration of the depth-first scheduling space for DNN accelerators ☆43 · Feb 8, 2023 · Updated 3 years ago
- Heterogeneous Cluster Interconnect to bind special-purpose HW accelerators with general-purpose cluster cores ☆14 · Feb 27, 2026 · Updated last week
- Model LLM inference on single-core dataflow accelerators ☆18 · Dec 16, 2025 · Updated 2 months ago
- Provides the hardware code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerator… ☆25 · Jul 14, 2020 · Updated 5 years ago
- Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures. ☆459 · Feb 19, 2026 · Updated 2 weeks ago
- Accelergy is an energy estimation infrastructure for accelerator energy estimations ☆156 · May 26, 2025 · Updated 9 months ago
- Exercises for exploring the Fibertree, Timeloop and Accelergy tools ☆116 · Apr 9, 2025 · Updated 10 months ago
- A heterogeneous accelerator-centric compute cluster ☆33 · Feb 26, 2026 · Updated last week
- The wafer-native AI accelerator simulation platform and inference engine. ☆50 · Jan 1, 2026 · Updated 2 months ago
- Wraps the NVDLA project for Chipyard integration ☆22 · Sep 2, 2025 · Updated 6 months ago
- ☆72 · Feb 16, 2023 · Updated 3 years ago
- A systolic array simulator for multi-cycle MACs and varying-byte words, with the paper accepted to HPCA 2022. ☆85 · Nov 7, 2021 · Updated 4 years ago
- Repository to host and maintain SCALE-Sim code ☆417 · Feb 2, 2026 · Updated last month
- FPGA acceleration of arbitrary precision floating point computations. ☆40 · May 17, 2022 · Updated 3 years ago
- The official implementation of HPCA 2025 paper, Prosperity: Accelerating Spiking Neural Networks via Product Sparsity ☆37 · Aug 9, 2025 · Updated 6 months ago
- ☆35 · Mar 1, 2019 · Updated 7 years ago
- SAMO: Streaming Architecture Mapping Optimisation ☆34 · Oct 4, 2023 · Updated 2 years ago
- Provides the code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators" by Luk… ☆19 · Oct 6, 2019 · Updated 6 years ago
- Linux docker for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop ☆64 · Oct 14, 2025 · Updated 4 months ago
- NeuraLUT-Assemble ☆47 · Aug 20, 2025 · Updated 6 months ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆26 · Jun 16, 2025 · Updated 8 months ago
- Machine-Learning Accelerator System Exploration Tools ☆198 · Feb 24, 2026 · Updated last week
- Pulp virtual platform ☆24 · Jul 16, 2025 · Updated 7 months ago
- Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators ☆110 · Apr 28, 2025 · Updated 10 months ago
- A tool to deploy Deep Neural Networks on PULP-based SoC's ☆93 · Aug 4, 2025 · Updated 7 months ago
- HW accelerator mapping optimization framework for in-memory computing ☆28 · Jun 3, 2025 · Updated 9 months ago
- ☆20 · Feb 25, 2026 · Updated last week
- Multimedia SoC Design with Specialization on Application Acceleration with High-Level-Synthesis [2020 Fall] ☆12 · Jun 15, 2021 · Updated 4 years ago
- Training Quantized Neural Networks with a Full-precision Auxiliary Module ☆13 · Jun 19, 2020 · Updated 5 years ago
- ☆13 · Oct 26, 2023 · Updated 2 years ago
- Fast Emulation of Approximate DNN Accelerators in PyTorch ☆30 · Feb 23, 2024 · Updated 2 years ago
- PyTorch model to RTL flow for low latency inference ☆131 · Mar 15, 2024 · Updated last year
- HW/SW co-design of sentence-level energy optimizations for latency-aware multi-task NLP inference ☆54 · Mar 24, 2024 · Updated last year
- The framework for the paper "Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators" in ISCA 2023. ☆82 · Mar 12, 2025 · Updated 11 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better ☆16 · Feb 15, 2025 · Updated last year
- ☆11 · Mar 16, 2022 · Updated 3 years ago
- Low level design of a chip built for optimizing/accelerating CNN classifiers over gray scale images. ☆13 · May 14, 2019 · Updated 6 years ago
- PLCT Lab 2019 Open Day materials (OpenDay-2019) ☆11 · Dec 20, 2019 · Updated 6 years ago