rkinas / cuda-learning
This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mastering CUDA programming, whether you're just starting or looking to optimize and scale your GPU-accelerated applications.
☆348 Updated 3 months ago
Alternatives and similar repositories for cuda-learning
Users who are interested in cuda-learning are comparing it to the repositories listed below.
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆357 Updated 2 months ago
- ☆328 Updated last month
- 100 days of building GPU kernels! ☆430 Updated last month
- GPU Kernels ☆178 Updated last month
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆683 Updated 2 months ago
- Learnings and programs related to CUDA ☆402 Updated 3 months ago
- An ML Systems Onboarding list ☆794 Updated 4 months ago
- ☆255 Updated 4 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆184 Updated last week
- Apply GPU in ML and DL ☆52 Updated 3 months ago
- ☆168 Updated 5 months ago
- ☆1,148 Updated last month
- CUDA tutorials for maths and ML with examples, covering multi-GPU, fused attention, Winograd convolution, and reinforcement learning. ☆183 Updated last month
- GPU programming related news and material links ☆1,540 Updated 4 months ago
- UNet diffusion model in pure CUDA ☆606 Updated 11 months ago
- The Tensor (or Array) ☆433 Updated 9 months ago
- Small auto-grad engine inspired by Karpathy's micrograd and PyTorch ☆268 Updated 6 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆181 Updated 3 weeks ago
- CUDA Learning guide ☆382 Updated 11 months ago
- Some CUDA example code with READMEs. ☆99 Updated 3 months ago
- This repo has all the basic things you'll need in order to understand the complete vision transformer architecture and its various implementa… ☆218 Updated 5 months ago
- High Quality Resources on GPU Programming/Architecture ☆587 Updated 10 months ago
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!) ☆151 Updated 11 months ago
- PyTorch implementations of algorithms from "Reinforcement Learning: An Introduction" by Sutton and Barto, along with various RL research … ☆138 Updated last week
- Alex Krizhevsky's original code from Google Code ☆192 Updated 9 years ago
- ☆157 Updated last year
- (WIP) A small but powerful, homemade PyTorch from scratch. ☆553 Updated this week
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆351 Updated last month
- ☆408 Updated this week
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ☆29 Updated last month