stillwater-sc / hpr-blas
High-Performance Reproducible BLAS using posit arithmetic
☆12 Updated 2 years ago
Related projects:
- Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware ☆15 Updated 2 years ago
- BLAS implementation for Intel FPGA ☆75 Updated 3 years ago
- Custom-Precision Floating-point numbers. ☆28 Updated 3 months ago
- Error-Free Transformations as building blocks for compensated algorithms ☆12 Updated last year
- TAPA is a dataflow HLS framework that features fast compilation, expressive programming model and generates high-frequency FPGA accelerat… ☆19 Updated 3 weeks ago
- A GPU performance prediction toolkit for CUDA programs ☆16 Updated 5 years ago
- Stencil with Optimized Dataflow Architecture ☆15 Updated 6 months ago
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆17 Updated last year
- Orio is an open-source extensible framework for the definition of domain-specific languages and generation of optimized code for multiple… ☆36 Updated 2 years ago
- MLIR tools and dialect for GraphBLAS ☆15 Updated 2 years ago
- Example for running IREE in a bare-metal Arm environment. ☆22 Updated this week
- AI Accelerators-SC23-tutorial Repository ☆11 Updated 10 months ago
- Recursive LAPACK Collection ☆42 Updated 2 years ago
- FPGA acceleration of arbitrary precision floating point computations. ☆34 Updated 2 years ago
- SForum 2020: "A Run-time Hardware Routing Implementation for CGRA Overlays" code and data. ☆11 Updated 4 years ago
- IP prototyping in FPGA hardware ☆18 Updated 6 years ago
- A tool for debugging and assessing floating point precision and reproducibility. ☆64 Updated 4 months ago
- Chisel library for Unum Type-III Posit Arithmetic ☆30 Updated 5 months ago
- Custom BLAS and LAPACK Cross-Compilation Framework for RISC-V ☆17 Updated 4 years ago
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code ☆13 Updated last year
- Round matrix elements to lower precision in MATLAB ☆35 Updated 2 years ago
- A polyhedral compiler for hardware accelerators ☆55 Updated last month
- A domain-specific language and compiler for image processing ☆76 Updated 3 years ago
- cuASR: CUDA Algebra for Semirings ☆30 Updated 2 years ago
- Rodinia Benchmark Suite for OpenCL-based FPGAs ☆29 Updated last year