radarFudan/Curse-of-memory

Curse-of-memory phenomenon of RNNs in sequence modelling

☆19

Alternatives and similar repositories for Curse-of-memory

Users who are interested in Curse-of-memory are comparing it to the libraries listed below.

Noahs-ARK / PaLM
PyTorch implementation for PaLM: A Hybrid Parser and Language Model.
☆10 · Jan 7, 2020 · Updated 6 years ago
google-deepmind / spectral_ssm
☆35 · Apr 12, 2024 · Updated 2 years ago
sustcsonglin / flash-linear-rnn
Implementations of various linear RNN layers using pytorch and triton
☆55 · Aug 4, 2023 · Updated 2 years ago
IDSIA / rtrl-elstm
Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)
☆13 · Jun 11, 2025 · Updated last year
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 · Mar 15, 2024 · Updated 2 years ago
bojone / rnn
Some RNN implementations
☆52 · Mar 29, 2023 · Updated 3 years ago
HazyResearch / embroid
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
☆11 · Aug 12, 2023 · Updated 2 years ago
maximzubkov / fft-scan
Efficient PScan implementation in PyTorch
☆17 · Jan 2, 2024 · Updated 2 years ago
OpenNLPLab / HGRN
[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…
☆68 · Apr 24, 2024 · Updated 2 years ago
NVIDIA / HMM_sample_code
CUDA 12.2 HMM demos
☆21 · Jul 26, 2024 · Updated 2 years ago
OscarXZQ / delta_activations
Official code release for Delta Activations: A Representation for Finetuned Large Language Models
☆20 · Sep 5, 2025 · Updated 10 months ago
ahennequ / pytorch-custom-mma
☆30 · Oct 3, 2022 · Updated 3 years ago
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48 · Dec 11, 2023 · Updated 2 years ago
sustcsonglin / gated_linear_attention_layer
☆32 · Jan 7, 2024 · Updated 2 years ago
deep-spin / sparse-communication
☆12 · Mar 7, 2022 · Updated 4 years ago
tk-rusch / unicornn
Official code for UnICORNN (ICML 2021)
☆28 · Oct 1, 2021 · Updated 4 years ago
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 · Apr 17, 2024 · Updated 2 years ago
jenni-ai / T2FW
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20 · Oct 9, 2022 · Updated 3 years ago
sjelassi / transformers_ssm_copy
☆40 · Feb 26, 2024 · Updated 2 years ago
johnryan465 / pscan
☆40 · Jan 5, 2024 · Updated 2 years ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33 · May 25, 2024 · Updated 2 years ago
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15 · Jan 23, 2024 · Updated 2 years ago
subho406 / agalite
AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR)
☆24 · Oct 15, 2024 · Updated last year
kazuki-irie / kv-memory-brain
Official Code Repository for the paper "Key-value memory in the brain"
☆32 · Feb 25, 2025 · Updated last year
catid / spectral_ssm
Implementation of Spectral State Space Models
☆16 · Feb 23, 2024 · Updated 2 years ago
yikangshen / megablocks
☆20 · May 30, 2024 · Updated 2 years ago
siyuanseever / llama2Rnn.c
☆13 · Apr 15, 2024 · Updated 2 years ago
allenbai01 / transformers-as-statisticians
☆36 · Jul 5, 2023 · Updated 3 years ago
sc782 / SBM-Transformer
☆15 · Feb 23, 2023 · Updated 3 years ago
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18 · Jan 8, 2024 · Updated 2 years ago
srush / mamba-scans
Blog post
☆17 · Feb 16, 2024 · Updated 2 years ago
john-x-jiang / meta_ssm
Repository for ICLR 2023 work, "Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting"
☆31 · Sep 11, 2024 · Updated last year
zhangjiong724 / spectral-RNN
STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION
☆16 · Jun 5, 2018 · Updated 8 years ago
acosharma / elita-transformer
Official Repository for Efficient Linear-Time Attention Transformers.
☆17 · Jun 2, 2024 · Updated 2 years ago
berlino / seq_icl
☆54 · May 20, 2024 · Updated 2 years ago
Lingkai-Kong / so-ebm
Code for paper: End-to-end Stochastic Optimization with Energy-based Model
☆16 · Feb 14, 2023 · Updated 3 years ago
KurochkinAlexey / AntisymmetricRNN
Python implementation of paper "AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks"
☆15 · Aug 2, 2019 · Updated 6 years ago
Doraemonzzz / tnn-pytorch
☆20 · Apr 17, 2023 · Updated 3 years ago
Benjamin-Walker / selective-ssms-and-linear-cdes
Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024)
☆17 · Jan 7, 2025 · Updated last year