albertozeni / starlight
Starlight: A Kernel Optimizer for GPU Processing
☆13 · Updated 8 months ago
Related projects:
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆17 · Updated last year
- AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP (Full Paper a… ☆19 · Updated last month
- Image Registration on FPGAs ☆19 · Updated 2 years ago
- FRAME: Fast Roofline Analytical Modeling and Estimation ☆28 · Updated 11 months ago
- Code base for OOPSLA'24 paper: UniSparse: An Intermediate Language for General Sparse Format Customization ☆28 · Updated 3 months ago
- SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24) ☆23 · Updated 2 months ago
- SparseP is the first open-source Sparse Matrix Vector Multiplication (SpMV) software package for real-world Processing-In-Memory (PIM) ar… ☆71 · Updated 2 years ago
- A high-level performance analysis tool for FPGA-based accelerators ☆18 · Updated 7 years ago
- ☆27 · Updated 5 years ago
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs ☆21 · Updated 2 months ago
- MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23) ☆14 · Updated 5 months ago
- HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration ☆13 · Updated 4 years ago
- GPTPU for SC 2021 ☆46 · Updated last year
- A binary instrumentation tool to analyze load instructions in any off-the-shelf x86(-64) program. Described by Bera et al. in https://arx… ☆12 · Updated 2 months ago
- Adaptive floating-point based numerical format for resilient deep learning ☆14 · Updated 2 years ago
- ETHZ Heterogeneous Accelerated Compute Cluster. ☆28 · Updated 3 weeks ago
- ☆10 · Updated 10 months ago
- Heterogeneous Accelerated Compute Cluster (HACC) Resources Page ☆19 · Updated last week
- ☆38 · Updated last week
- A graph linear algebra overlay ☆47 · Updated last year
- FPGA version of Rodinia in HLS C/C++ ☆32 · Updated 3 years ago
- ☆76 · Updated this week
- SAURIA (Systolic-Array tensor Unit for aRtificial Intelligence Acceleration) is an open-source Convolutional Neural Network accelerator b… ☆17 · Updated 2 months ago
- ☆15 · Updated 3 years ago
- A Toy-Purpose TPU Simulator ☆10 · Updated 3 months ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆29 · Updated last month
- PIM-ML is a benchmark for training machine learning algorithms on the UPMEM architecture, which is the first publicly-available real-worl… ☆15 · Updated last year
- Public repository for the DAC 2021 paper "Scaling up HBM Efficiency of Top-K SpMV for Approximate Embedding Similarity on FPGAs" ☆14 · Updated 3 years ago
- Heterogeneous ML accelerator ☆15 · Updated 5 months ago
- ACM TODAES Best Paper Award, 2022 ☆23 · Updated 10 months ago