argonne-lcf / AIaccelerators-SC23-tutorial
AI Accelerators-SC23-tutorial Repository
☆11 · Updated 11 months ago
Related projects
Alternatives and complementary repositories for AIaccelerators-SC23-tutorial
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆18 · Updated last year
- Heterogeneous Accelerated Compute Cluster (HACC) Resources Page ☆19 · Updated 2 weeks ago
- Code base for OOPSLA'24 paper: UniSparse: An Intermediate Language for General Sparse Format Customization ☆28 · Updated 5 months ago
- Multi-target compiler for Sum-Product Networks, based on MLIR and LLVM. ☆22 · Updated this week
- SparseP is the first open-source Sparse Matrix Vector Multiplication (SpMV) software package for real-world Processing-In-Memory (PIM) ar… ☆70 · Updated 2 years ago
- TAPA is a dataflow HLS framework that features fast compilation, expressive programming model and generates high-frequency FPGA accelerat… ☆19 · Updated 2 months ago
- Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware ☆15 · Updated 2 years ago
- Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators ☆81 · Updated 3 weeks ago
- A PIM instrumentation, compilation, execution, simulation, and evaluation repository for BLIMP-style architectures. ☆16 · Updated 2 years ago
- Data-Centric MLIR dialect ☆38 · Updated last year
- Tutorial Material from the SST Team ☆18 · Updated 6 months ago
- FRAME: Fast Roofline Analytical Modeling and Estimation ☆31 · Updated last year
- Agile hardware-software co-design ☆44 · Updated 2 years ago
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions. ☆45 · Updated last month
- Chai ☆42 · Updated 11 months ago
- EQueue Dialect ☆39 · Updated 2 years ago
- Benchmark for measuring the performance of sparse and irregular memory access. ☆75 · Updated 2 weeks ago
- CUDA Flux is a profiler for GPU applications that reports the basic block execution frequencies of compute kernels ☆31 · Updated 3 years ago
- Polyhedral High-Level Synthesis in MLIR ☆29 · Updated last year
- PIM-ML is a benchmark for training machine learning algorithms on the UPMEM architecture, which is the first publicly-available real-worl… ☆18 · Updated last year
- Dynamically Reconfigurable Architecture Template and Cycle-level Microarchitecture Simulator for Dataflow AcCelerators ☆28 · Updated last year
- [FPGA'21] Microbenchmarks for Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers ☆29 · Updated 2 years ago