ENCCS / gpu-programming
Meta-GPU lesson covering general aspects of GPU programming as well as specific frameworks
☆98 · Updated 3 weeks ago
Alternatives and similar repositories for gpu-programming
Users who are interested in gpu-programming compare it to the repositories listed below
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆370 · Updated 7 months ago
- Tensor library & inference framework for machine learning ☆115 · Updated 2 months ago
- Quantum computing without the linear algebra ☆77 · Updated 2 weeks ago
- LLM training in simple, raw C/CUDA ☆108 · Updated last year
- Custom PTX Instruction Benchmark ☆136 · Updated 9 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆53 · Updated this week
- NVIDIA Math Libraries for the Python Ecosystem ☆542 · Updated last month
- High-Performance SGEMM on CUDA devices ☆113 · Updated 10 months ago
- All pdfs of Victor Eijkhout's Art of HPC books and courses ☆751 · Updated last year
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/ ☆36 · Updated last month
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆225 · Updated this week
- Visualization of cache-optimized matrix multiplication ☆157 · Updated 9 months ago
- HIP Python Low-level Bindings ☆32 · Updated last month
- Learning about CUDA by writing PTX code. ☆150 · Updated last year
- Fast and Furious AMD Kernels ☆321 · Updated this week
- Fast GPT-2 inference written in Fortran ☆202 · Updated 3 months ago
- Algebraic enhancements for GEMM & AI accelerators ☆282 · Updated 9 months ago
- Public repository for vol 2 of The Art of HPC: parallel programming ☆90 · Updated 2 months ago
- ☆249 · Updated last year
- Accelerated General (FP32) Matrix Multiplication from scratch in CUDA ☆174 · Updated 11 months ago
- General Matrix Multiplication using NVIDIA Tensor Cores ☆27 · Updated 10 months ago
- An Online Deep Learning Interface for HPC programs on NVIDIA GPUs ☆177 · Updated last week
- Public repository for The Art of HPC volume 1: Scientific Computing ☆64 · Updated last year
- Machine Learning with Symbolic Tensors ☆352 · Updated last month
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆64 · Updated last month
- C++ HPC Tutorial materials ☆54 · Updated last month
- A variety of programming models relevant to scientists explained, with an emphasis on how programming constructs map to parts of the com… ☆63 · Updated 7 years ago
- GPUOcelot: A dynamic compilation framework for PTX ☆219 · Updated 10 months ago
- ☆86 · Updated last month
- A collection of study materials for AI compilers and systems. ☆46 · Updated last month