NVIDIA / accelerated-computing-hub
NVIDIA-curated collection of educational resources related to general-purpose GPU programming.
☆199 · Updated last week
Alternatives and similar repositories for accelerated-computing-hub:
Users who are interested in accelerated-computing-hub are comparing it to the repositories listed below.
- The CUDA target for Numba ☆60 · Updated this week
- NVIDIA Math Libraries for the Python Ecosystem ☆235 · Updated 2 months ago
- The Foundation for All Legate Libraries ☆204 · Updated last week
- KvikIO - High Performance File IO ☆182 · Updated this week
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆610 · Updated 3 months ago
- CUDA Kernel Benchmarking Library ☆561 · Updated 3 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆36 · Updated this week
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆209 · Updated 2 months ago
- Kernel Tuner ☆311 · Updated last week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆303 · Updated this week
- Training material for Nsight developer tools ☆148 · Updated 6 months ago
- Python SYCL bindings and SYCL-based Python Array API library ☆109 · Updated this week
- N-Ways to Multi-GPU Programming ☆16 · Updated last year
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆48 · Updated 4 months ago
- NVIDIA tools guide ☆102 · Updated last month
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆130 · Updated 4 years ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆347 · Updated this week
- Reference implementations of MLPerf™ HPC training benchmarks ☆45 · Updated 8 months ago
- Data Parallel Extension for Numba ☆79 · Updated 3 months ago
- Training materials provided by OpenACC.org. ☆87 · Updated 6 months ago
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/ ☆30 · Updated 5 months ago
- RAPIDS Memory Manager ☆534 · Updated this week
- Experimental projects related to TensorRT ☆89 · Updated this week
- Collection of benchmarks to measure basic GPU capabilities ☆296 · Updated last week
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆77 · Updated this week
- Advanced Profiling and Analytics for AMD Hardware ☆140 · Updated this week
- Kokkos C++ Performance Portability Programming Ecosystem: Profiling and Debugging Tools ☆118 · Updated last month
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆198 · Updated 2 months ago
- CUDA Matrix Multiplication Optimization ☆161 · Updated 7 months ago
- Exploring using stdpar and Cython ☆33 · Updated 4 years ago