srush / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆10,471 Updated 5 months ago
Alternatives and similar repositories for GPU-Puzzles:
Users who are interested in GPU-Puzzles are comparing it to the libraries listed below.
- Solve puzzles. Improve your pytorch. ☆3,409 Updated 6 months ago
- Machine Learning Engineering Open Book ☆12,714 Updated last week
- Puzzles for learning Triton ☆1,375 Updated 2 months ago
- Development repository for the Triton language and compiler ☆14,324 Updated this week
- What would you do with 1000 H100s... ☆992 Updated last year
- Tile primitives for speedy kernels ☆1,995 Updated this week
- Fast and memory-efficient exact attention ☆15,392 Updated this week
- Material for gpu-mode lectures ☆3,663 Updated this week
- llama3 implementation one matrix multiplication at a time ☆14,107 Updated 8 months ago
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ☆8,713 Updated this week
- The full minitorch student suite. ☆2,001 Updated 5 months ago
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆31,215 Updated this week
- A playbook for systematically maximizing the performance of deep learning models. ☆27,971 Updated 7 months ago
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆5,777 Updated last month
- Flax is a neural network library for JAX that is designed for flexibility. ☆6,317 Updated this week
- GPU programming related news and material links ☆1,358 Updated last month
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆2,919 Updated this week
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆7,585 Updated 5 months ago
- Train transformer language models with reinforcement learning. ☆11,398 Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆8,293 Updated this week
- Language model alignment-focused deep learning curriculum ☆1,321 Updated 5 months ago
- An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management. ☆5,115 Updated last week
- A Python framework for high performance GPU simulation and graphics ☆4,540 Updated this week
- Inference Llama 2 in one file of pure C ☆18,007 Updated 6 months ago
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,727 Updated last month
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆11,113 Updated 6 months ago
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆27,906 Updated this week
- JAX - A curated list of resources https://github.com/google/jax ☆1,686 Updated 7 months ago
- CUDA Templates for Linear Algebra Subroutines ☆6,178 Updated this week
- The Art of Debugging ☆853 Updated 6 months ago