srush / Transformer-Puzzles
Puzzles for exploring transformers
☆325 Updated last year
Related projects
Alternatives and complementary repositories for Transformer-Puzzles
- ☆391 Updated last month
- What would you do with 1000 H100s... ☆903 Updated 10 months ago
- A puzzle to learn about prompting ☆121 Updated last year
- An interactive exploration of Transformer programming. ☆246 Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆516 Updated this week
- ☆197 Updated 4 months ago
- Annotated version of the Mamba paper ☆457 Updated 8 months ago
- Resources from the EleutherAI Math Reading Group ☆51 Updated last month
- For optimization algorithm research and development. ☆449 Updated this week
- Tools for understanding how transformer predictions are built layer-by-layer ☆430 Updated 5 months ago
- Solve puzzles. Learn CUDA. ☆61 Updated 11 months ago
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆200 Updated 9 months ago
- ☆161 Updated last year
- JAX implementation of the Llama 2 model ☆210 Updated 9 months ago
- Fast bare-bones BPE for modern tokenizer training ☆142 Updated last month
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆334 Updated 3 months ago
- ☆292 Updated 4 months ago
- Puzzles for learning Triton ☆1,135 Updated this week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆715 Updated last month
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆333 Updated 3 weeks ago
- Extract full next-token probabilities via language model APIs ☆229 Updated 8 months ago
- ☆224 Updated 4 months ago
- Named tensors with first-class dimensions for PyTorch ☆322 Updated last year
- Mechanistic Interpretability Visualizations using React ☆198 Updated 4 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆252 Updated last year
- Building blocks for foundation models. ☆394 Updated 10 months ago
- Highly commented implementations of Transformers in PyTorch ☆128 Updated last year
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆483 Updated 3 weeks ago
- seqax = sequence modeling + JAX ☆133 Updated 4 months ago
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆157 Updated last year