dshah3 / GPU-Puzzles
Solve puzzles. Learn CUDA.
☆61 Updated last year
Alternatives and similar repositories for GPU-Puzzles:
Users who are interested in GPU-Puzzles are comparing it to the repositories listed below.
- ☆85 Updated 11 months ago
- ☆140 Updated 11 months ago
- ☆75 Updated 6 months ago
- seqax = sequence modeling + JAX ☆136 Updated 6 months ago
- Experiment of using Tangent to autodiff triton ☆74 Updated last year
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆121 Updated 9 months ago
- ☆204 Updated 6 months ago
- JAX implementation of the Llama 2 model ☆213 Updated 11 months ago
- A puzzle to learn about prompting ☆123 Updated last year
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆167 Updated last year
- supporting pytorch FSDP for optimizers ☆75 Updated last month
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆116 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆91 Updated 2 months ago
- ☆53 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 Updated last month
- Simple Transformer in Jax ☆130 Updated 7 months ago
- Minimal but scalable implementation of large language models in JAX ☆28 Updated 2 months ago
- A really tiny autograd engine ☆89 Updated 9 months ago
- A simple library for scaling up JAX programs ☆129 Updated 2 months ago
- An implementation of the Llama architecture, to instruct and delight ☆21 Updated 2 weeks ago
- ☆58 Updated 2 years ago
- WIP ☆93 Updated 5 months ago
- ring-attention experiments ☆119 Updated 3 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆261 Updated 7 months ago
- Puzzles for exploring transformers ☆331 Updated last year
- ☆27 Updated 6 months ago
- Understand and test language model architectures on synthetic tasks. ☆177 Updated 2 weeks ago
- A set of Python scripts that makes your experience on TPU better ☆44 Updated 6 months ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems ☆99 Updated last week
- ☆171 Updated last week