modular / mojo-gpu-puzzles
Learn GPU Programming in Mojo🔥 by Solving Puzzles
☆236Updated last week
Alternatives and similar repositories for mojo-gpu-puzzles
Users that are interested in mojo-gpu-puzzles are comparing it to the libraries listed below
- Machine Learning library for the emerging Mojo/Python ecosystem☆295Updated last week
- port of Andrej Karpathy's llm.c to Mojo☆360Updated 3 months ago
- Quantized LLM training in pure CUDA/C++.☆216Updated this week
- A Machine Learning framework from scratch in Pure Mojo 🔥☆438Updated 10 months ago
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI.☆150Updated 2 years ago
- Simple MPI implementation for prototyping or learning☆288Updated 3 months ago
- Learning about CUDA by writing PTX code.☆147Updated last year
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs☆691Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best!☆61Updated last week
- Competitive GPU kernel optimization platform.☆135Updated this week
- PyTorch Single Controller☆901Updated this week
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆196Updated 5 months ago
- A Learning Journey: Micrograd in Mojo 🔥☆63Updated last year
- Solve puzzles to improve your tinygrad skills!☆151Updated last month
- Tensor library with autograd using only Rust's standard library☆70Updated last year
- Where GPUs get cooked 👩‍🍳🔥☆317Updated 2 months ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!☆158Updated last week
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.☆432Updated 8 months ago
- SIMD quantization kernels☆92Updated 2 months ago
- Tutorials on tinygrad☆439Updated last month
- (WIP) A small but powerful, homemade PyTorch from scratch.☆659Updated this week
- An implementation of the transformer architecture onto an Nvidia CUDA kernel☆195Updated 2 years ago
- ☆28Updated last year
- A working machine learning framework in pure Mojo 🔥☆129Updated last year
- Implementation of Karpathy's micrograd in Mojo☆78Updated 2 years ago
- Alex Krizhevsky's original code from Google Code☆198Updated 9 years ago
- Fast and Furious AMD Kernels☆278Updated this week
- Complete solutions to the Programming Massively Parallel Processors Edition 4☆582Updated 5 months ago
- High-Performance SGEMM on CUDA devices☆110Updated 10 months ago
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.☆303Updated 2 weeks ago