srush / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆11,180 Updated 9 months ago
Alternatives and similar repositories for GPU-Puzzles
Users who are interested in GPU-Puzzles are comparing it to the repositories listed below.
- Solve puzzles. Improve your pytorch. ☆3,609 Updated 11 months ago
- Machine Learning Engineering Open Book ☆14,082 Updated 2 weeks ago
- LLM training in simple, raw C/CUDA ☆26,962 Updated last month
- The full minitorch student suite. ☆2,121 Updated 10 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,717 Updated 11 months ago
- NanoGPT (124M) in 3 minutes ☆2,699 Updated last week
- Puzzles for learning Triton ☆1,726 Updated 7 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆12,343 Updated this week
- Video+code lecture on building nanoGPT from scratch ☆4,168 Updated 10 months ago
- Neural Networks: Zero to Hero ☆14,094 Updated 10 months ago
- Material for gpu-mode lectures ☆4,636 Updated last week
- llama3 implementation one matrix multiplication at a time ☆15,011 Updated last year
- Kolmogorov Arnold Networks ☆15,745 Updated 5 months ago
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆12,173 Updated 10 months ago
- Schedule-Free Optimization in PyTorch ☆2,180 Updated last month
- This repository contains demos I made with the Transformers library by HuggingFace. ☆11,006 Updated last month
- Inference Llama 2 in one file of pure C ☆18,491 Updated 10 months ago
- "Probabilistic Machine Learning" - a book series by Kevin Murphy ☆5,258 Updated 2 months ago
- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ☆22,142 Updated 10 months ago
- Tensor library for machine learning ☆12,712 Updated this week
- Implement a ChatGPT-like LLM in PyTorch from scratch, step by step ☆52,336 Updated this week
- Development repository for the Triton language and compiler ☆15,939 Updated this week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆42,229 Updated 6 months ago
- Tile primitives for speedy kernels ☆2,478 Updated this week
- PyTorch native post-training library ☆5,287 Updated this week
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆29,472 Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,589 Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆50,864 Updated this week
- A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support ☆15,730 Updated this week
- Implementation for MatMul-free LM. ☆3,010 Updated 7 months ago