pujyam / simian
Simian Process Oriented Conservative JIT PDES from LANL
☆11 Updated 3 years ago
Alternatives and similar repositories for simian
Users who are interested in simian are comparing it to the libraries listed below.
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆15 Updated 6 years ago
- ETHZ Heterogeneous Accelerated Compute Cluster. ☆36 Updated 3 months ago
- A GPU FP32 computation method with Tensor Cores. ☆20 Updated 2 years ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆25 Updated 8 months ago
- Multiple 1-stencil implementations using NVIDIA CUDA. ☆13 Updated 7 years ago
- A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/… ☆19 Updated last year
- A unified programming framework for high and portable performance across FPGAs and GPUs ☆11 Updated 3 months ago
- Code released to accompany the ISCA paper: "T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware" ☆28 Updated 3 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆18 Updated 2 years ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure. ☆19 Updated last year
- SpV8 is a SpMV kernel written in AVX-512. Artifact for our SpV8 paper @ DAC '21. ☆28 Updated 4 years ago
- Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels ☆13 Updated 9 years ago
- An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3 ☆27 Updated 4 years ago
- HeteroCL-MLIR dialect for accelerator design ☆41 Updated 9 months ago
- CAKE Library for constant-bandwidth matrix multiplication on CPUs ☆15 Updated last year
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels ☆32 Updated 4 years ago
- A machine learning-based computer architecture simulator. ☆22 Updated 10 months ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆30 Updated 9 months ago
- Heterogeneous Accelerated Compute Cluster (HACC) Resources Page ☆21 Updated last week
- Performance Prediction Toolkit ☆52 Updated 6 months ago
- A Benchmark Suite for Heterogeneous System Computation ☆53 Updated 4 months ago
- Multi-target compiler for Sum-Product Networks, based on MLIR and LLVM. ☆23 Updated 7 months ago
- Chai ☆44 Updated last year
- Polyhedral High-Level Synthesis in MLIR ☆33 Updated 2 years ago