ischlag / Fast-Weight-Memory-public

Official code repository of the paper "Learning Associative Inference Using Fast Weight Memory" by Schlag et al.

☆28

Alternatives and similar repositories for Fast-Weight-Memory-public

Users who are interested in Fast-Weight-Memory-public are comparing it to the repositories listed below.

RobertCsordas / ndr
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".
☆33 Updated last month
zomux / lanmt-ebm
lanmt ebm
☆12 Updated 5 years ago
yoonkim / neural-qcfg
☆45 Updated 3 years ago
RobertCsordas / transformer_generalization
The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s…
☆67 Updated 2 years ago
nng555 / ssmba
☆62 Updated 3 years ago
INK-USC / NExT
Source Code for paper "Learning from Explanations with Neural Execution Tree", ICLR 2020
☆18 Updated 4 years ago
belindal / TaskBench500
Suite of 500 procedurally-generated NLP tasks to study language model adaptability
☆21 Updated 3 years ago
CAMTL / CA-MTL
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data
☆57 Updated 4 years ago
lifu-tu / ENGINE
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
☆25 Updated 4 years ago
jungokasai / deep-shallow
☆44 Updated 4 years ago
yzpang / gold-off-policy-text-gen-iclr21
☆50 Updated 3 years ago
tatsu-lab / mlm_inductive_bias
Code Release for "On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies"
☆16 Updated 4 years ago
jungokasai / twist_decoding
☆29 Updated 3 years ago
thunlp / DPT
☆13 Updated 3 years ago
CLAW-Lab / ToM
Code accompanying ICML 2021 paper "Few-shot Language Coordination by Modeling Theory of Mind"
☆19 Updated 3 years ago
XuezheMax / fairseq-apollo
FairSeq repo with Apollo optimizer
☆114 Updated last year
lucidrains / learning-to-expire-pytorch
An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain
☆34 Updated 4 years ago
galsang / trees_from_transformers
Official code for the ICLR 2020 paper 'ARE PRE-TRAINED LANGUAGE MODELS AWARE OF PHRASES? SIMPLE BUT STRONG BASELINES FOR GRAMMAR INDUCTIO…
☆30 Updated 2 years ago
lucidrains / memory-transformer-xl
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
☆49 Updated 5 years ago
xuanlinli17 / autoregressive_inference
Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)
☆12 Updated last year
JunShern / few-shot-adaptation
Exploring Few-Shot Adaptation of Language Models with Tables
☆24 Updated 2 years ago
jason9693 / FROZEN
☆14 Updated 3 years ago
HanGuo97 / soft-Q-learning-for-text-generation
☆70 Updated 2 years ago
facebookresearch / Permutation-Equivariant-Seq2Seq
Humans understand novel sentences by composing meanings and roles of core language components. In contrast, neural network models for nat…
☆27 Updated 5 years ago
harvardnlp / cascaded-generation
Cascaded Text Generation with Markov Transformers
☆129 Updated 2 years ago
FranxYao / RDP
Implementation of ICML 22 Paper: Scaling Structured Inference with Randomization
☆14 Updated 3 years ago
10-zin / Synthesizer
A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"
☆73 Updated 2 years ago
harvardnlp / hmm-lm
☆41 Updated 4 years ago
microsoft / EfficientLongSequenceModeling
☆51 Updated 2 years ago
google-deepmind / emergent_in_context_learning
☆84 Updated last year