pujyam / simian
Simian Process Oriented Conservative JIT PDES from LANL
☆13 Updated last month
Alternatives and similar repositories for simian
Users who are interested in simian are comparing it to the libraries listed below.
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆22 Updated 2 years ago
- ☆22 Updated 11 months ago
- A GPU FP32 computation method with Tensor Cores. ☆26 Updated 2 months ago
- ETHZ Heterogeneous Accelerated Compute Cluster. ☆38 Updated 4 months ago
- CAKE Library for constant-bandwidth matrix multiplication on CPUs ☆14 Updated last year
- ☆40 Updated 3 years ago
- GPU Performance Advisor ☆65 Updated 3 years ago
- Multi-target compiler for Sum-Product Networks, based on MLIR and LLVM. ☆25 Updated last year
- ☆31 Updated 3 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆32 Updated 4 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆17 Updated 2 years ago
- Heterogeneous Accelerated Computed Cluster (HACC) Resources Page ☆22 Updated 4 months ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆15 Updated 6 years ago
- ☆17 Updated 4 years ago
- GPTPU for SC 2021 ☆52 Updated 2 years ago
- ☆41 Updated 4 months ago
- Code released to accompany the ISCA paper: "T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware" ☆28 Updated 3 years ago
- ☆18 Updated 3 years ago
- An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE ☆18 Updated 3 years ago
- Polyhedral High-Level Synthesis in MLIR ☆35 Updated 2 years ago
- Multiple 1-stencil implementations using nvidia cuda. ☆13 Updated 8 years ago
- ☆13 Updated 10 years ago
- A unified programming framework for high and portable performance across FPGAs and GPUs ☆11 Updated 10 months ago
- A machine learning-based computer architecture simulator. ☆24 Updated last year
- ☆11 Updated 4 years ago
- BEER determines an ECC code's parity-check matrix based on the uncorrectable errors it can cause. BEER targets Hamming codes that are use… ☆19 Updated 5 years ago
- HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration ☆15 Updated 5 years ago
- Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware ☆15 Updated 3 years ago
- ☆31 Updated last week
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆27 Updated last year