ClashLuke / MinRETRO

Reimplementation of `Improving language models by retrieving from trillions of tokens`

☆17

Related projects

Alternatives and complementary repositories for MinRETRO

jxiw / BiGS
Official Repository of Pretraining Without Attention (BiGS). BiGS is the first model to achieve BERT-level transfer learning on the GLUE …
☆114 Updated 8 months ago
HomebrewNLP / HomebrewNLP
A case study of efficient training of large language models using commodity hardware.
☆68 Updated 2 years ago
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆52 Updated last year
hadasah / btm
☆71 Updated 6 months ago
irhum / hyena
JAX/Flax implementation of the Hyena Hierarchy
☆31 Updated last year
lucidrains / mlp-gpt-jax
A GPT, made only of MLPs, in Jax
☆55 Updated 3 years ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆47 Updated 2 years ago
dkopi / Bitune
Implementation of Bitune: Bidirectional Instruction-Tuning
☆15 Updated 5 months ago
HomebrewNLP / Olmax
HomebrewNLP in JAX flavour for maintainable TPU-Training
☆46 Updated 10 months ago
krandiash / quinine
A library to create and manage configuration files, especially for machine learning projects.
☆77 Updated 2 years ago
lucidrains / g-mlp-gpt
GPT, but made only out of MLPs
☆87 Updated 3 years ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆95 Updated last year
automl / zero-shot-automl-with-pretrained-models
Official repository for the paper "Zero-Shot AutoML with Pretrained Models"
☆41 Updated 10 months ago
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆95 Updated 6 months ago
shikaiqiu / compute-better-spent
☆46 Updated last month
google-deepmind / asyncdiloco
☆39 Updated 10 months ago
EdinburghNLP / torch-adaptive-imle
☆34 Updated last year
sholtodouglas / scalingExperiments
☆57 Updated 2 years ago
lucidrains / memory-editable-transformer
My explorations into editing the knowledge and memories of an attention network
☆34 Updated last year
IdoAmos / not-from-scratch
☆26 Updated last month
catie-aq / flashT5
A fast implementation of T5/UL2 in PyTorch using Flash Attention
☆71 Updated last month
lucidrains / ponder-transformer
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
☆79 Updated 3 years ago
gregorbachmann / scaling_mlps
☆51 Updated 5 months ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆163 Updated 6 months ago
lucidrains / panoptic-transformer
Another attempt at a long-context / efficient transformer by me
☆37 Updated 2 years ago
lucidrains / product-key-memory
Standalone Product Key Memory module in Pytorch - for augmenting Transformer models
☆73 Updated 3 months ago
google-research / jestimator
Amos optimizer with JEstimator lib.
☆81 Updated 6 months ago
crowsonkb / LDLM
Latent Diffusion Language Models
☆67 Updated last year