srush / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆10,958 Updated 8 months ago
Alternatives and similar repositories for GPU-Puzzles:
Users who are interested in GPU-Puzzles are comparing it to the repositories listed below
- Solve puzzles. Improve your pytorch. ☆3,554 Updated 9 months ago
- Machine Learning Engineering Open Book ☆13,643 Updated this week
- Material for gpu-mode lectures ☆4,360 Updated 3 months ago
- llama3 implementation one matrix multiplication at a time ☆14,925 Updated 11 months ago
- Tile primitives for speedy kernels ☆2,325 Updated this week
- Puzzles for learning Triton ☆1,614 Updated 5 months ago
- GPU programming related news and material links ☆1,488 Updated 4 months ago
- Development repository for the Triton language and compiler ☆15,504 Updated this week
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆8,279 Updated 8 months ago
- LLM training in simple, raw C/CUDA ☆26,563 Updated this week
- The full minitorch student suite. ☆2,066 Updated 8 months ago
- What would you do with 1000 H100s... ☆1,046 Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆5,946 Updated last month
- Explanation to key concepts in ML ☆7,563 Updated this week
- A PyTorch native library for large-scale model training ☆3,675 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆12,076 Updated this week
- ☆4,079 Updated 11 months ago
- A computer science textbook ☆4,115 Updated 9 months ago
- Understanding Deep Learning - Simon J.D. Prince ☆7,452 Updated 3 weeks ago
- "Probabilistic Machine Learning" - a book series by Kevin Murphy ☆5,209 Updated 3 weeks ago
- Schedule-Free Optimization in PyTorch ☆2,154 Updated last month
- Samples for CUDA Developers which demonstrates features in CUDA Toolkit ☆7,393 Updated last week
- MLX: An array framework for Apple silicon ☆20,509 Updated this week
- Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023 ☆2,825 Updated last month
- 🔥Highlighting the top ML papers every week. ☆11,190 Updated last month
- High-speed Large Language Model Serving for Local Deployment ☆8,191 Updated 2 months ago
- An ML Systems Onboarding list ☆776 Updated 3 months ago
- PyTorch native post-training library ☆5,171 Updated this week
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,774 Updated 2 weeks ago
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ☆4,633 Updated last month