PArallelLOOPgEneratoR: Threaded Loops Code Generation Infrastructure targeting Tensor Contraction Applications such as GEMMs, Convolutions and Fused Deep Learning Primitives
☆19Jan 22, 2026Updated last month
Alternatives and similar repositories for parlooper
Users who are interested in parlooper are comparing it to the libraries listed below
- Intel® Tensor Processing Primitives extension for Pytorch*☆18Feb 23, 2026Updated last week
- ☆13Jan 8, 2020Updated 6 years ago
- ☆11Mar 14, 2023Updated 2 years ago
- Mathematics.NET is a C# class library that provides tools for solving advanced mathematical problems.☆16Feb 20, 2026Updated last week
- Artifact associated with CHES 2022 paper https://tches.iacr.org/index.php/TCHES/article/view/9817☆12Nov 10, 2023Updated 2 years ago
- Implementation of a Systolic Array based sorting engine on an FPGA using Verilog☆11May 11, 2017Updated 8 years ago
- Export an ics class timetable from the Hunan University academic affairs system☆12Feb 21, 2020Updated 6 years ago
- ☆12Mar 28, 2023Updated 2 years ago
- Matrix Accelerator Generator for GeMM Operations based on SIGMA Architecture in CHISEL HDL☆15Mar 21, 2024Updated last year
- ADNet Implementation using Tensorflow☆10Mar 28, 2020Updated 5 years ago
- A Prot paper related materials☆11Sep 5, 2022Updated 3 years ago
- ☆13May 25, 2022Updated 3 years ago
- slic video segmentation☆10Mar 8, 2015Updated 10 years ago
- OpenVINO LLM Benchmark☆11Dec 7, 2023Updated 2 years ago
- Fast and memory-efficient exact attention☆15Feb 13, 2026Updated 2 weeks ago
- SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization☆11Aug 12, 2020Updated 5 years ago
- ☆11Apr 16, 2023Updated 2 years ago
- ☆12Aug 26, 2025Updated 6 months ago
- NeuraChip Accelerator Simulator☆16Apr 26, 2024Updated last year
- Stencil with Optimized Dataflow Architecture☆12Feb 27, 2024Updated 2 years ago
- C++ implementation of the NL-means denoising algorithm☆11Mar 11, 2019Updated 6 years ago
- Flexible memory allocation tool for multi-tiered memory systems☆13Jan 7, 2026Updated last month
- A Fast Graph Update Library for FPGA-based Dynamic Graph Processing☆10Dec 20, 2021Updated 4 years ago
- ☆15Mar 24, 2023Updated 2 years ago
- Fibertree emulator☆17Nov 4, 2024Updated last year
- Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication☆15Mar 25, 2024Updated last year
- OpenGL learning examples☆11Apr 6, 2019Updated 6 years ago
- ☆12Apr 16, 2022Updated 3 years ago
- Systolic array based hardware for Image processing on the SPARTAN-6 FPGA☆13May 26, 2016Updated 9 years ago
- Graph accelerator on FPGAs and ASICs☆11Aug 16, 2018Updated 7 years ago
- HLS project modeling various sparse accelerators.☆12Jan 11, 2022Updated 4 years ago
- including compiler to encode DGL GNN model to instructions, runtime software to transfer data and control the accelerator, and hardware v…☆14Nov 19, 2023Updated 2 years ago
- ☆10Sep 14, 2023Updated 2 years ago
- RESPECT: Reinforcement Learning based Edge Scheduling on Pipelined Coral Edge TPUs (DAC'23)☆11Apr 13, 2023Updated 2 years ago
- A collection of publications on low-illumination image enhancement☆11Jun 4, 2020Updated 5 years ago
- iOS Memory scan tool☆11Jun 13, 2017Updated 8 years ago
- ☆14Apr 8, 2025Updated 10 months ago
- ☆11Feb 13, 2025Updated last year
- A conda-smithy repository for ambertools.☆11Feb 26, 2025Updated last year