spcl / npbench
NPBench - A Benchmarking Suite for High-Performance NumPy
☆81 Updated last month
Alternatives and similar repositories for npbench
Users who are interested in npbench are comparing it to the libraries listed below.
- POC work on MLIR backend ☆55 Updated 10 months ago
- Collection of scripts to build PyTorch and the domain libraries from source. ☆12 Updated last week
- Benchmark for measuring the performance of sparse and irregular memory access. ☆78 Updated last month
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆206 Updated last month
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆108 Updated 2 years ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆47 Updated this week
- JUPITER Benchmark Suite ☆17 Updated 10 months ago
- Analyze graph/hierarchical performance data using pandas dataframes ☆115 Updated 4 months ago
- OpenMP Offloading Validation & Verification Suite; Official repository. We have migrated from bitbucket!! For documentation, results, pub… ☆58 Updated 2 weeks ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆33 Updated 2 months ago
- A unified framework across multiple programming platforms ☆41 Updated 3 weeks ago
- ROCm SPARSE marshalling library ☆67 Updated this week
- RAJA Performance Suite ☆117 Updated this week
- Loop Kernel Analysis and Performance Modeling Toolkit ☆93 Updated 3 months ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 Updated 3 years ago
- Data Parallel Extension for Numba ☆81 Updated 7 months ago
- ☆60 Updated last month
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆56 Updated 2 months ago
- Python interface for the LIKWID C API (https://github.com/RRZE-HPC/likwid) ☆46 Updated this week
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆77 Updated 3 weeks ago
- Error-Free Transformations as building blocks for compensated algorithms ☆15 Updated 2 years ago
- Advanced Profiling and Analytics for AMD Hardware ☆157 Updated this week
- Using C++ magic to launch/capture CUDA kernels and tune them with Kernel Tuner ☆20 Updated last year
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data ☆33 Updated last week
- A tracing infrastructure for heterogeneous computing applications. ☆33 Updated this week
- The CUDA target for Numba ☆140 Updated this week
- ytopt: machine-learning-based autotuning and hyperparameter optimization framework using Bayesian Optimization ☆48 Updated last week
- A lightweight, Pythonic, frontend for MLIR ☆81 Updated last year
- TAU Performance System Public Mirror (Updated every night at midnight, USA Pacific Time) ☆48 Updated last week
- Python wrapper for isl, an integer set library ☆77 Updated last week