flozz / pypapi
Python binding for the PAPI (Performance Application Programming Interface) library
☆43 Updated this week
Related projects
Alternatives and complementary repositories for pypapi
- Python interface for the LIKWID C API (https://github.com/RRZE-HPC/likwid) ☆44 Updated last year
- Notes and artifacts from the ONNX steering committee ☆25 Updated last week
- PyProf2: PyTorch profiling tool ☆83 Updated 4 years ago
- Automatically insert NVTX ranges into PyTorch models ☆17 Updated 3 years ago
- A portable interface for energy monitoring utilities ☆36 Updated 2 months ago
- Bandwidth test for ROCm ☆49 Updated this week
- A Deep Learning Meta-Framework and HPC Benchmarking Library ☆81 Updated 2 years ago
- Productionize machine learning predictions, with ONNX or without ☆66 Updated 10 months ago
- Worked example of the process from Python source to CUDA kernel execution with Numba ☆36 Updated 2 months ago
- ParaDnn: A systematic performance analysis methodology for deep learning ☆39 Updated 4 years ago
- Benchmarks for Python ☆27 Updated 3 months ago
- A GPU performance prediction toolkit for CUDA programs ☆16 Updated 5 years ago
- Codebase associated with the PyTorch compiler tutorial ☆44 Updated 5 years ago
- Python bindings for UCX ☆121 Updated this week
- A GPU performance profiling tool for PyTorch models ☆22 Updated 2 years ago
- PyTorch UCC plugin ☆17 Updated 3 years ago
- Tools and extensions for CUDA profiling ☆63 Updated 4 years ago
- NVIDIA's launch, startup, and logging scripts used by our MLPerf Training and HPC submissions ☆22 Updated 3 weeks ago
- Python bindings for NVTX ☆66 Updated last year
- A thin wrapper around MIOpen and cuDNN ☆38 Updated last year
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications ☆22 Updated last month
- Experiments evaluating preemption on the NVIDIA Pascal architecture ☆18 Updated 8 years ago
- Artifacts for the SOSP'19 paper "Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions" ☆21 Updated 2 years ago
- A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data for distributed tra… ☆18 Updated last year
- A tracing JIT for PyTorch ☆17 Updated 2 years ago
- XLA integration of Open Neural Network Exchange (ONNX) ☆19 Updated 6 years ago
- MLPerf™ logging library ☆30 Updated this week
- Data and tooling to compare the API surfaces of various array libraries ☆54 Updated 5 months ago
- The ROCdebug-agent is a library that can be loaded by the ROCm Platform Runtime to provide some debugging functionality ☆23 Updated this week
- Torch Frontend for IREE ☆25 Updated 11 months ago