srush / Transformer-Puzzles
Puzzles for exploring transformers
☆333, updated last year
Alternatives and similar repositories for Transformer-Puzzles:
Users who are interested in Transformer-Puzzles compare it to the libraries listed below.
- ☆420, updated 5 months ago
- What would you do with 1000 H100s... ☆1,016, updated last year
- An interactive exploration of Transformer programming. ☆261, updated last year
- A puzzle to learn about prompting ☆124, updated last year
- ☆214, updated 8 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆557, updated this week
- Annotated version of the Mamba paper ☆475, updated last year
- For optimization algorithm research and development. ☆498, updated this week
- Solve puzzles. Learn CUDA. ☆63, updated last year
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆370, updated this week
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆206, updated last year
- ☆301, updated 9 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆524, updated last month
- Named tensors with first-class dimensions for PyTorch ☆321, updated last year
- ☆165, updated last year
- Puzzles for learning Triton ☆1,508, updated 4 months ago
- A Jax-based library for designing and training transformer models from scratch. ☆282, updated 6 months ago
- Building blocks for foundation models. ☆464, updated last year
- seqax = sequence modeling + JAX ☆148, updated last week
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255, updated last year
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆783, updated 2 weeks ago
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆343, updated 7 months ago
- git extension for {collaborative, communal, continual} model development ☆208, updated 4 months ago
- jax-triton contains integrations between JAX and OpenAI Triton ☆384, updated last week
- A tool to analyze and debug neural networks in pytorch. Use a GUI to traverse the computation graph and view the data from many different… ☆282, updated 3 months ago
- Language Modeling with the H3 State Space Model ☆516, updated last year
- ☆525, updated last year
- A repository for research on medium sized language models. ☆493, updated 2 months ago
- ☆407, updated 8 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆269, updated 9 months ago