ENCCS / gpu-programming
Meta-GPU lesson covering general aspects of GPU programming as well as specific frameworks
☆67Updated this week
Related projects
Alternatives and complementary repositories for gpu-programming
- LLM training in simple, raw C/CUDA☆86Updated 6 months ago
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems.☆44Updated last month
- All pdfs of Victor Eijkhout's Art of HPC books and courses☆517Updated 7 months ago
- High-Performance FP32 Matrix Multiplication on CPU☆301Updated this week
- Public repository for The Art of HPC volume 1: Scientific Computing☆44Updated 7 months ago
- Repository with examples and exercises for OLCF and AMD's HIP training series☆14Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor☆135Updated 2 weeks ago
- A variety of programming models relevant to scientists explained, with an emphasis on how programming constructs map to parts of the com…☆58Updated 6 years ago
- GPUOcelot: A dynamic compilation framework for PTX☆147Updated last month
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/☆29Updated 2 months ago
- Data and reproducibility scripts for the UoB-HPC Performance Portability studies☆14Updated 5 months ago
- Nvidia Instruction Set Specification Generator☆216Updated 4 months ago
- Metal Shading Language on Apple M1's GPU for scientific C++.☆82Updated last year
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy…☆93Updated 3 weeks ago
- NVIDIA Math Libraries for the Python Ecosystem☆205Updated this week
- AI Training Series Material☆26Updated last month
- LLM inference in Fortran☆53Updated 5 months ago
- A package for defining deep learning models using categorical algebraic expressions.☆56Updated 3 months ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings.☆27Updated this week
- Fast GPT-2 inference written in Fortran☆187Updated 8 months ago
- An Online Deep Learning Interface for HPC programs on NVIDIA GPUs☆156Updated 2 weeks ago
- Public repository for vol 2 of The Art of HPC: parallel programming☆66Updated 7 months ago
- AMD’s C++ library for accelerating tensor primitives☆35Updated this week
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona…☆63Updated this week
- Deep learning accelerator architectures requiring half the multipliers☆263Updated 7 months ago
- An implementation of HIP that works on CPUs, across OSes.☆112Updated 8 months ago
- hipFFT is an FFT marshalling library.☆54Updated this week
- A framework that supports executing unmodified CUDA source code on non-NVIDIA devices.☆105Updated 3 months ago
- Slides/notes and Jupyter notebook demos for an introductory course of numerical analysis/scientific computing☆50Updated this week