srush / raspy
An interactive exploration of Transformer programming.
☆268 Updated last year
Alternatives and similar repositories for raspy
Users that are interested in raspy are comparing it to the libraries listed below
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers" ☆318 Updated 10 months ago
- git extension for {collaborative, communal, continual} model development ☆217 Updated 8 months ago
- Puzzles for exploring transformers ☆355 Updated 2 years ago
- ☆274 Updated last year
- Extract full next-token probabilities via language model APIs ☆247 Updated last year
- ☆540 Updated last year
- Resources from the EleutherAI Math Reading Group ☆53 Updated 5 months ago
- ☆443 Updated 9 months ago
- A Jax-based library for building transformers, includes implementations of GPT, Gemma, LlaMa, Mixtral, Whisper, SWin, ViT and more. ☆291 Updated 11 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆627 Updated this week
- A puzzle to learn about prompting ☆132 Updated 2 years ago
- Automatic gradient descent ☆207 Updated 2 years ago
- seqax = sequence modeling + JAX ☆165 Updated last week
- ☆166 Updated 2 years ago
- Functional local implementations of main model parallelism approaches ☆94 Updated 2 years ago
- Train very large language models in Jax. ☆205 Updated last year
- Erasing concepts from neural representations with provable guarantees ☆231 Updated 6 months ago
- Understand and test language model architectures on synthetic tasks. ☆221 Updated 3 weeks ago
- 🧱 Modula software package ☆210 Updated last week
- Neural Networks and the Chomsky Hierarchy ☆207 Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆127 Updated 2 years ago
- ☆134 Updated 4 months ago
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆207 Updated 2 months ago
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆349 Updated last year
- Fast bare-bones BPE for modern tokenizer training ☆160 Updated last month
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆94 Updated 7 months ago
- Tools for working with the Abstraction & Reasoning Corpus ☆196 Updated 11 months ago
- Solve puzzles. Learn CUDA. ☆64 Updated last year
- JAX implementation of the Llama 2 model ☆219 Updated last year
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆184 Updated 2 years ago