google-research / meliad
☆253 · Updated 2 years ago
Alternatives and similar repositories for meliad:
Users interested in meliad are comparing it to the repositories listed below.
- Neural Networks and the Chomsky Hierarchy ☆196 · Updated 9 months ago
- Recurrent Memory Transformer ☆148 · Updated last year
- Implementation of Recurrent Memory Transformer (NeurIPS 2022 paper) in PyTorch ☆403 · Updated 3 weeks ago
- Understand and test language model architectures on synthetic tasks. ☆177 · Updated 2 weeks ago
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers" (NeurIPS 2023) ☆130 · Updated 9 months ago
- ☆336 · Updated 9 months ago
- ☆164 · Updated last year
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆174 · Updated last year
- Sequence modeling with Mega. ☆297 · Updated 2 years ago
- ☆136 · Updated last year
- Inference code for LLaMA models in JAX ☆114 · Updated 8 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆205 · Updated 5 months ago
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers" ☆301 · Updated 4 months ago
- Implementation of the conditionally routed attention from the CoLT5 architecture, in PyTorch ☆225 · Updated 4 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆219 · Updated last month
- ☆515 · Updated 11 months ago
- Tools for understanding how transformer predictions are built layer by layer ☆461 · Updated 7 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆545 · Updated last month
- Implementation of Block Recurrent Transformer in PyTorch ☆217 · Updated 5 months ago
- Train very large language models in JAX. ☆198 · Updated last year
- JAX implementation of the Llama 2 model ☆213 · Updated 11 months ago
- Implementation of Memorizing Transformers (ICLR 2022), an attention net augmented with indexing and retrieval of memories using approximate … (see the retrieval sketch after this list) ☆629 · Updated last year
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆194 · Updated last year
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆203 · Updated last year
- Language Modeling with the H3 State Space Model ☆515 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX ☆536 · Updated this week
- Code for the ALiBi method for transformer language models (ICLR 2022; see the bias sketch after this list) ☆512 · Updated last year
- Implementation of https://srush.github.io/annotated-s4 ☆479 · Updated last year
- ☆203 · Updated 6 months ago
- ☆169 · Updated last year
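
The Memorizing Transformers entry above describes attention augmented with indexing and retrieval of past memories. Below is a minimal toy sketch of that retrieval step in plain NumPy: it stores past (key, value) pairs and looks up the top-k keys per query by exact inner-product search. The `KNNMemory` class and its method names are illustrative, not the repo's API, and the paper's implementation uses approximate nearest-neighbor search to scale to large memories.

```python
import numpy as np

class KNNMemory:
    """Toy external memory: exact top-k inner-product search over all
    stored keys (the paper uses approximate nearest neighbors at scale)."""

    def __init__(self, dim: int):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def add(self, keys: np.ndarray, values: np.ndarray) -> None:
        # Append (key, value) pairs produced by earlier context chunks.
        self.keys = np.vstack([self.keys, keys])
        self.values = np.vstack([self.values, values])

    def retrieve(self, queries: np.ndarray, k: int):
        scores = queries @ self.keys.T               # (num_queries, num_memories)
        top = np.argsort(-scores, axis=-1)[:, :k]    # indices of the best k per query
        return self.keys[top], self.values[top]      # each (num_queries, k, dim)

mem = KNNMemory(dim=4)
mem.add(np.random.randn(16, 4), np.random.randn(16, 4))
k_ret, v_ret = mem.retrieve(np.random.randn(2, 4), k=3)
print(k_ret.shape, v_ret.shape)  # (2, 3, 4) (2, 3, 4)
```

The retrieved pairs are then attended to alongside the local context, which is what lets the model reuse information from far beyond its attention window.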
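
The ALiBi entry links the official code for attention with linear biases. As a rough sketch of the idea (not the repo's code): each head adds a penalty to the attention logits proportional to the query-key distance, with head slopes forming the geometric sequence 2^(-8/n), 2^(-16/n), … from the paper. The helper names below are made up for illustration.

```python
import numpy as np

def alibi_slopes(num_heads: int) -> np.ndarray:
    # Geometric head slopes 2^(-8(i+1)/num_heads), as in the ALiBi paper
    # (this closed form is exact for power-of-two head counts).
    return np.array([2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> np.ndarray:
    # distance[i, j] = j - i, so keys behind the query (j < i) receive a
    # negative bias proportional to how far back they sit; positions ahead
    # of the query are removed by the causal mask anyway.
    distance = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    # Shape (heads, seq, seq); added to the attention logits before softmax.
    return alibi_slopes(num_heads)[:, None, None] * distance[None, :, :]

print(alibi_bias(num_heads=8, seq_len=4)[0])  # head 0 has slope 1/2
```

Because the bias grows linearly with distance rather than being baked into learned position embeddings, models trained this way extrapolate to sequences longer than those seen in training.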