modular / mojo-gpu-puzzles
Learn GPU Programming in Mojo 🔥 by Solving Puzzles
☆121 · Updated last week
Alternatives and similar repositories for mojo-gpu-puzzles
Users interested in mojo-gpu-puzzles are comparing it to the repositories listed below.
- Scientific Computing in Python 🐍 with Mojo 🔥 acceleration ☆274 · Updated 2 weeks ago
- Port of Andrej Karpathy's llm.c to Mojo ☆357 · Updated 3 weeks ago
- A Machine Learning framework from scratch in Pure Mojo 🔥 ☆442 · Updated 7 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆51 · Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆292 · Updated 2 weeks ago
- ☆28 · Updated 11 months ago
- Small-scale distributed training of sequential deep learning models, built on NumPy and MPI. ☆136 · Updated last year
- Implementation of Karpathy's micrograd in Mojo ☆76 · Updated last year
- A Learning Journey: Micrograd in Mojo 🔥 ☆61 · Updated 10 months ago
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs ☆562 · Updated this week
- A working machine learning framework in pure Mojo 🔥 ☆131 · Updated last year
- Machine Learning algorithms in pure Mojo 🔥 ☆39 · Updated this week
- PyTorch Single Controller ☆374 · Updated this week
- Where GPUs get cooked 👩‍🍳🔥 ☆279 · Updated 3 weeks ago
- Learning about CUDA by writing PTX code. ☆135 · Updated last year
- Dion optimizer algorithm ☆318 · Updated last week
- NuMojo is a library for numerical computing in Mojo 🔥 similar to numpy in Python. ☆177 · Updated 2 weeks ago
- SIMD quantization kernels ☆83 · Updated last week
- Tensor library with autograd using only Rust's standard library ☆69 · Updated last year
- Competitive GPU kernel optimization platform. ☆95 · Updated 3 weeks ago
- An implementation of the transformer architecture in an Nvidia CUDA kernel ☆189 · Updated last year
- jax-triton contains integrations between JAX and OpenAI Triton ☆415 · Updated 2 months ago
- The Tensor (or Array) ☆441 · Updated last year
- GPU documentation for humans ☆131 · Updated this week
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP ☆111 · Updated this week
- High-Performance SGEMM on CUDA devices ☆97 · Updated 7 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆191 · Updated 2 months ago
- ☆54 · Updated 3 weeks ago
- ☆13 · Updated 3 weeks ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆74 · Updated this week