ischlag/fast-weight-transformers

Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers.

☆115

Alternatives and similar repositories for fast-weight-transformers

Users who are interested in fast-weight-transformers are comparing it to the libraries listed below.

IDSIA / recurrent-fwp
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers" (NeurIPS 2021)
☆52 · Jun 11, 2025 · Updated last year
IDSIA / lmtool-fwp
PyTorch Language Modeling Toolkit for Fast Weight Programmers
☆22 · Jun 11, 2025 · Updated last year
giuseppepastore10 / STRICT
Official code for the paper: "A Closer Look at Self-training for Zero-Label Semantic Segmentation" https://arxiv.org/abs/2104.11692
☆25 · Aug 22, 2021 · Updated 4 years ago
IDSIA / modern-srwm
Official repository for the paper "A Modern Self-Referential Weight Matrix That Learns to Modify Itself" (ICML 2022 & NeurIPS 2021 Deep R…
☆177 · Jun 11, 2025 · Updated last year
ArneBinder / GlomImpl
Implementation of the GLOM model for text
☆11 · Mar 4, 2021 · Updated 5 years ago
BayesWatch / deficient-efficient
Successfully training approximations to full-rank matrices for efficiency in deep learning.
☆16 · Jan 5, 2021 · Updated 5 years ago
MicroSTM / AGENT-synthesis
Data synthesis code for "AGENT: A Benchmark for Core Psychological Reasoning"
☆24 · Mar 3, 2022 · Updated 4 years ago
akbir / deq-jax
[NeurIPS'19] Deep Equilibrium Models Jax Implementation
☆43 · Oct 26, 2020 · Updated 5 years ago
ltgoslo / factorizer
☆16 · May 14, 2024 · Updated 2 years ago
Noahs-ARK / RFA
☆33 · Apr 12, 2021 · Updated 5 years ago
yilundu / ebm_compositionality
[NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models
☆47 · Mar 24, 2023 · Updated 3 years ago
mlpen / Nystromformer
☆392 · Oct 18, 2023 · Updated 2 years ago
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,775 · Mar 23, 2023 · Updated 3 years ago
google-research / long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
☆787 · Dec 16, 2023 · Updated 2 years ago
jinxu06 / gsubsampling
Reference implementation for "Group Equivariant Subsampling"
☆16 · Dec 13, 2021 · Updated 4 years ago
Chenglin-Yang / LESA_classification
Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms
☆11 · Nov 29, 2021 · Updated 4 years ago
mjliu2020 / RandomFR
☆10 · Jan 5, 2024 · Updated 2 years ago
RobertCsordas / moe_layer
sigma-MoE layer
☆21 · Jan 5, 2024 · Updated 2 years ago
facebookresearch / Addressing-the-Topological-Defects-of-Disentanglement
Repo reproducing experimental results in "Addressing the Topological Defects of Disentanglement"
☆24 · Jul 15, 2022 · Updated 4 years ago
ChristophReich1996 / ToeffiPy
ToeffiPy is a PyTorch like autograd/deep learning library based only on NumPy.
☆16 · Mar 28, 2022 · Updated 4 years ago
lsj2408 / GraphNorm
[ICML 2021] GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training (official implementation)
☆106 · Dec 19, 2022 · Updated 3 years ago
lucidrains / performer-pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
☆1,179 · Feb 2, 2022 · Updated 4 years ago
ofirpress / shortformer
Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.
☆147 · Jul 26, 2021 · Updated 5 years ago
cpcp1998 / PermuteFormer
Code for the paper PermuteFormer
☆42 · Oct 10, 2021 · Updated 4 years ago
OpenNLPLab / HGRN
[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…
☆68 · Apr 24, 2024 · Updated 2 years ago
yoonkim / neural-qcfg
☆45 · Oct 11, 2021 · Updated 4 years ago
lottery-ticket / code
☆13 · Mar 8, 2020 · Updated 6 years ago
radarFudan / Curse-of-memory
Curse-of-memory phenomenon of RNNs in sequence modelling
☆19 · May 8, 2025 · Updated last year
stanford-futuredata / sinkhorn-label-allocation
Sinkhorn Label Allocation is a label assignment method for semi-supervised self-training algorithms. The SLA algorithm is described in fu…
☆54 · Jun 15, 2021 · Updated 5 years ago
jjzha / cartography-al
Code base for the EMNLP 2021 Findings paper: Cartography Active Learning
☆14 · Jun 3, 2025 · Updated last year
ag1988 / top_k_attention
The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…
☆70 · Sep 19, 2021 · Updated 4 years ago
showgood / onlisp
Paul Graham's onlisp book in org mode format
☆22 · May 11, 2024 · Updated 2 years ago
mpatacchiola / self-supervised-relational-reasoning
Official PyTorch implementation of the paper "Self-Supervised Relational Reasoning for Representation Learning", NeurIPS 2020 Spotlight.
☆143 · Apr 21, 2024 · Updated 2 years ago
foxlf823 / DilatedRnn
A PyTorch implement of Dilated RNN
☆11 · Dec 31, 2017 · Updated 8 years ago
gmongaras / Cottention_Transformer
Code for the paper "Cottention: Linear Transformers With Cosine Attention"
☆20 · Nov 15, 2025 · Updated 8 months ago
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆47 · Sep 22, 2024 · Updated last year
lucidrains / ESBN-pytorch
Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch
☆25 · Jan 6, 2021 · Updated 5 years ago
RUNCSP / RUN-CSP
☆10 · Mar 24, 2023 · Updated 3 years ago
frankaging / Causal-Distill
The Codebase for Causal Distillation for Language Models (NAACL '22)
☆26 · May 1, 2022 · Updated 4 years ago