ENCCS / gpu-programming
Meta-GPU lesson covering general aspects of GPU programming as well as specific frameworks
☆91 Updated last week
Alternatives and similar repositories for gpu-programming
Users who are interested in gpu-programming are comparing it to the repositories listed below.
- LLM training in simple, raw C/CUDA ☆107 Updated last year
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆367 Updated 6 months ago
- High-Performance SGEMM on CUDA devices ☆109 Updated 9 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆52 Updated this week
- Custom PTX Instruction Benchmark ☆131 Updated 8 months ago
- Visualization of cache-optimized matrix multiplication ☆155 Updated 7 months ago
- HIP Python Low-level Bindings ☆30 Updated last week
- Public repository for vol 2 of The Art of HPC: parallel programming ☆91 Updated last month
- All pdfs of Victor Eijkhout's Art of HPC books and courses ☆729 Updated last year
- Learning about CUDA by writing PTX code. ☆146 Updated last year
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆62 Updated 2 weeks ago
- NVIDIA Math Libraries for the Python Ecosystem ☆532 Updated 2 months ago
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/ ☆35 Updated last week
- GPUOcelot: A dynamic compilation framework for PTX ☆211 Updated 9 months ago
- The Foundation for All Legate Libraries ☆231 Updated this week
- Competitive GPU kernel optimization platform. ☆113 Updated last week
- pytorch from scratch in pure C/CUDA and python ☆41 Updated last year
- The CUDA target for Numba ☆207 Updated last week
- Accelerated General (FP32) Matrix Multiplication from scratch in CUDA ☆166 Updated 10 months ago
- Learn GPU Programming in Mojo🔥 by Solving Puzzles ☆195 Updated last week
- This repository collects the materials from the course "Foundations of HPC", 2021, at the Data Science and Scientific Computing Departmen… ☆23 Updated 3 years ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆116 Updated last week
- Quantum computing without the linear algebra ☆76 Updated 4 months ago
- Nvidia Instruction Set Specification Generator ☆297 Updated last year
- throwaway GPT inference ☆140 Updated last year
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- ☆136 Updated 2 years ago
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆214 Updated last year
- A variety of programming models relevant to scientists explained, with an emphasis on how programming constructs map to parts of the com… ☆63 Updated 7 years ago
- Little OpenMP Library ☆168 Updated 3 years ago