machine-discovery/deer

Parallelizing non-linear sequential models over the sequence length

☆57

Alternatives and similar repositories for deer

Users who are interested in deer are comparing it to the libraries listed below.

lindermanlab / elk
Scalable and Stable Parallelization of Nonlinear RNNs
☆33 · Jun 28, 2026 · Updated 3 weeks ago
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 · Mar 15, 2024 · Updated 2 years ago
Doraemonzzz / hgru-pytorch
☆29 · Jul 9, 2024 · Updated 2 years ago
OpenNLPLab / HGRN
[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…
☆68 · Apr 24, 2024 · Updated 2 years ago
yikangshen / megablocks
☆20 · May 30, 2024 · Updated 2 years ago
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18 · Jan 8, 2024 · Updated 2 years ago
IDSIA / rtrl-elstm
Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)
☆13 · Jun 11, 2025 · Updated last year
eamartin / parallelizing_linear_rnns
☆45 · Apr 30, 2018 · Updated 8 years ago
irhum / hyena
JAX/Flax implementation of the Hyena Hierarchy
☆35 · Apr 27, 2023 · Updated 3 years ago
subho406 / agalite
AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR)
☆24 · Oct 15, 2024 · Updated last year
dangxingyu / rnn-icrag
Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 · Apr 17, 2024 · Updated 2 years ago
amirzandieh / HyperAttention
Triton implementation of the HyperAttention algorithm
☆48 · Dec 11, 2023 · Updated 2 years ago
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆58 · Aug 20, 2024 · Updated last year
JRC1995 / Continuous-RvNN
Official Repository for "Modeling Hierarchical Structures with Continuous Recursive Neural Networks" (ICML 2021)
☆12 · Aug 18, 2021 · Updated 4 years ago
berlino / gated_linear_attention
☆107 · Mar 9, 2024 · Updated 2 years ago
sustcsonglin / mamba-triton
☆52 · Jan 28, 2024 · Updated 2 years ago
shawntan / stickbreaking-attention
Stick-breaking attention
☆63 · Jul 1, 2025 · Updated last year
lxxue / prefix_sum
A PyTorch wrapper of parallel exclusive scan in CUDA
☆12 · May 25, 2023 · Updated 3 years ago
whyNLP / Probabilistic-Transformer
A probabilistic model for contextual word representation. Accepted to ACL 2023 Findings.
☆26 · Oct 22, 2023 · Updated 2 years ago
lsj2408 / URPE
[NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)
☆35 · Aug 6, 2023 · Updated 2 years ago
zeyuliu1037 / LMUFormer
ICLR 2024 LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
☆13 · Sep 20, 2024 · Updated last year
NicolasZucchet / minimal-LRU
Unofficial implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023)
☆62 · Sep 3, 2025 · Updated 10 months ago
zhangjiong724 / spectral-RNN
Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization
☆16 · Jun 5, 2018 · Updated 8 years ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆277 · Mar 22, 2026 · Updated 4 months ago
open-lm-engine / lm-engine
LM engine is a library for pretraining and finetuning LLMs
☆183 · Updated this week
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in PyTorch and JAX
☆92 · Jun 18, 2024 · Updated 2 years ago
glassroom / heinsen_sequence
Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023)
☆98 · Dec 5, 2024 · Updated last year
sustcsonglin / flash-linear-rnn
Implementations of various linear RNN layers using PyTorch and Triton
☆55 · Aug 4, 2023 · Updated 2 years ago
lindermanlab / S5
☆324 · Jan 8, 2025 · Updated last year
symoon11 / dreamerv3-flax
Flax Implementation of DreamerV3 on Crafter
☆18 · Nov 29, 2025 · Updated 7 months ago
bstadie / krazyworld
krazy grid world
☆26 · Mar 2, 2020 · Updated 6 years ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33 · May 25, 2024 · Updated 2 years ago
proger / accelerated-scan
Accelerated First Order Parallel Associative Scan
☆198 · Jan 7, 2026 · Updated 6 months ago
Doraemonzzz / xmixers
Xmixers: A collection of SOTA efficient token/channel mixers
☆28 · Sep 4, 2025 · Updated 10 months ago
radarFudan / mamba-minimal-jax
☆36 · Nov 22, 2024 · Updated last year
maximzubkov / fft-scan
Efficient PScan implementation in PyTorch
☆17 · Jan 2, 2024 · Updated 2 years ago
OpenNLPLab / Tnn
[ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling
☆80 · Apr 24, 2024 · Updated 2 years ago
radarFudan / Curse-of-memory
Curse-of-memory phenomenon of RNNs in sequence modelling
☆19 · May 8, 2025 · Updated last year
McGill-NLP / length-generalization
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
☆139 · Apr 30, 2024 · Updated 2 years ago