srush / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆11,834 Updated last year
Alternatives and similar repositories for GPU-Puzzles
Users who are interested in GPU-Puzzles are comparing it to the libraries listed below.
- Solve puzzles. Improve your pytorch. ☆3,851 Updated last year
- The full minitorch student suite. ☆2,260 Updated last year
- Material for gpu-mode lectures ☆5,432 Updated 2 weeks ago
- Puzzles for learning Triton ☆2,187 Updated last year
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆8,993 Updated last year
- GPU programming related news and material links ☆1,874 Updated 3 months ago
- Development repository for the Triton language and compiler ☆17,920 Updated this week
- CUDA Templates and Python DSLs for High-Performance Linear Algebra ☆8,991 Updated this week
- Tile primitives for speedy kernels ☆3,008 Updated 2 weeks ago
- Machine Learning Engineering Open Book ☆16,071 Updated this week
- Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/) ☆925 Updated last year
- A Python framework for accelerated simulation, data generation and spatial computing. ☆5,945 Updated this week
- A PyTorch native platform for training generative AI models ☆4,866 Updated this week
- NanoGPT (124M) in 3 minutes ☆3,974 Updated this week
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆14,078 Updated last year
- ☆2,334 Updated last month
- Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch ☆906 Updated 2 years ago
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,923 Updated 3 months ago
- CUDA Python: Performance meets Productivity ☆3,098 Updated this week
- Tensor library for machine learning ☆13,743 Updated last week
- What would you do with 1000 H100s... ☆1,133 Updated last year
- Schedule-Free Optimization in PyTorch ☆2,241 Updated 7 months ago
- LLM training in simple, raw C/CUDA ☆28,414 Updated 5 months ago
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,923 Updated 2 months ago
- This project is a stock trend prediction web application created using Python and Streamlit. The purpose of this web application is to al… ☆10 Updated 2 years ago
- CUDA Learning guide ☆500 Updated last year
- Learn CUDA Programming, published by Packt ☆1,218 Updated last year
- An ML Systems Onboarding list ☆957 Updated 11 months ago
- PyTorch native quantization and sparsity for training and inference ☆2,591 Updated this week
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ☆4,695 Updated last week