IntelPython / dpbench
Benchmark suite to evaluate Data Parallel Extensions for Python
☆17 · Updated 2 weeks ago
Related projects:
- Python SYCL bindings and SYCL-based Python Array API library ☆99 · Updated this week
- Data Parallel Extension for Numba ☆75 · Updated this week
- Data Parallel Extension for NumPy ☆97 · Updated this week
- Includes Python bindings to instrumentation and tracing technology (ITT) APIs for VTune ☆25 · Updated 8 months ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆73 · Updated 3 months ago
- NVIDIA curated collection of educational resources related to general purpose GPU programming. ☆58 · Updated last month
- oneAPI Technical Advisory Board (TAB) Meeting Notes ☆71 · Updated 7 months ago
- Analyze graph/hierarchical performance data using pandas dataframes ☆105 · Updated last month
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi… ☆196 · Updated last week
- Advanced Profiling and Analytics for AMD Hardware ☆132 · Updated last week
- The Foundation for All Legate Libraries ☆186 · Updated last week
- POC work on MLIR backend ☆46 · Updated 3 weeks ago
- ☆28 · Updated this week
- Python interface for the LIKWID C API (https://github.com/RRZE-HPC/likwid) ☆43 · Updated last year
- Next generation LAPACK implementation for ROCm platform ☆91 · Updated this week
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆41 · Updated last week
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆100 · Updated this week
- Intel® Tensor Processing Primitives extension for Pytorch* ☆10 · Updated last week
- ☆221 · Updated this week
- oneAPI Level Zero Conformance & Performance test content ☆45 · Updated last week
- RAND library for HIP programming language ☆111 · Updated this week
- Next generation FFT implementation for ROCm ☆173 · Updated this week
- DaCe - Data Centric Parallel Programming ☆490 · Updated this week
- Loop Kernel Analysis and Performance Modeling Toolkit ☆86 · Updated 2 weeks ago
- Stretching GPU performance for GEMMs and tensor contractions. ☆213 · Updated this week
- RAJA Performance Suite ☆110 · Updated last week
- ROCm Parallel Primitives ☆156 · Updated this week
- Training examples for SYCL ☆38 · Updated 6 months ago
- Reference implementations of MLPerf™ HPC training benchmarks ☆39 · Updated 3 months ago
- An implementation of BLAS using the SYCL open standard. ☆250 · Updated 2 weeks ago