emalach/LinearLM

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/emalach/LinearLM)

emalach / LinearLM

Code for the paper: https://arxiv.org/pdf/2309.06979.pdf

☆21

Alternatives and similar repositories for LinearLM

Users that are interested in LinearLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Doraemonzzz / nanoTransNormer
View on GitHub
☆11Oct 11, 2023Updated 2 years ago
bdusell / stack-attention
View on GitHub
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18Mar 15, 2024Updated 2 years ago
dangxingyu / rnn-icrag
View on GitHub
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27Apr 17, 2024Updated 2 years ago
glassroom / heinsen_attention
View on GitHub
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
☆25Jun 6, 2024Updated 2 years ago
IBM / selective-dense-state-space-model
View on GitHub
Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on …
☆16Sep 18, 2025Updated 10 months ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
sjelassi / transformers_ssm_copy
View on GitHub
☆40Feb 26, 2024Updated 2 years ago
kazuki-irie / kv-memory-brain
View on GitHub
Official Code Repository for the paper "Key-value memory in the brain"
☆32Feb 25, 2025Updated last year
rycolab / prefix-parsing
View on GitHub
☆14Feb 1, 2024Updated 2 years ago
OpenNLPLab / HGRN2
View on GitHub
HGRN2: Gated Linear RNNs with State Expansion
☆58Aug 20, 2024Updated last year
OpenMOSE / RWKV-Infer
View on GitHub
A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy…
☆51Oct 21, 2025Updated 8 months ago
jungokasai / T2R
View on GitHub
☆14Nov 20, 2022Updated 3 years ago
Doraemonzzz / xmixers
View on GitHub
Xmixers: A collection of SOTA efficient token/channel mixers
☆28Sep 4, 2025Updated 10 months ago
berlino / seq_icl
View on GitHub
☆54May 20, 2024Updated 2 years ago
ag1988 / dlr
View on GitHub
The accompanying code for "Simplifying and Understanding State Space Models with Diagonal Linear RNNs" (Ankit Gupta, Harsh Mehta, Jonatha…
☆23Dec 30, 2022Updated 3 years ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
maximzubkov / fft-scan
View on GitHub
Efficient PScan implementation in PyTorch
☆17Jan 2, 2024Updated 2 years ago
fla-org / fla-zoo
View on GitHub
Flash-Linear-Attention models beyond language
☆21Aug 28, 2025Updated 10 months ago
Eliyas0007 / Pytorch-Intention
View on GitHub
Unofficial implementation of paper : Exploring the Space of Key-Value-Query Models with Intention
☆12May 24, 2023Updated 3 years ago
iwiwi / epochraft-hf-fsdp
View on GitHub
Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP
☆11Jan 29, 2024Updated 2 years ago
da03 / criticize_text_generation
View on GitHub
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆12Mar 18, 2023Updated 3 years ago
NonvolatileMemory / flash_attn_gqa
View on GitHub
triton ver of gqa flash attn, based on the tutorial
☆12Aug 4, 2024Updated last year
jenni-ai / T2FW
View on GitHub
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20Oct 9, 2022Updated 3 years ago
BlinkDL / LinearAttentionArena
View on GitHub
Here we will test various linear attention designs.
☆62Apr 25, 2024Updated 2 years ago
bdusell / nondeterministic-stack-rnn
View on GitHub
Code for the paper "The Surprising Computational Power of Nondeterministic Stack RNNs" (DuSell and Chiang, 2023)
☆20Mar 21, 2024Updated 2 years ago
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
ermongroup / SPN_Variational_Inference
View on GitHub
PyTorch implementation for "Probabilistic Circuits for Variational Inference in Discrete Graphical Models", NeurIPS 2020
☆17Oct 11, 2021Updated 4 years ago
mcoavoux / mtg
View on GitHub
Statistical discontinuous constituent parsing
☆11Feb 15, 2018Updated 8 years ago
EleutherAI / rnngineering
View on GitHub
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33May 25, 2024Updated 2 years ago
samblouir / birdie
View on GitHub
☆15Jun 8, 2026Updated last month
robert-lieck / RBN
View on GitHub
Recursive Bayesian Networks
☆11May 11, 2025Updated last year
Noahs-ARK / PaLM
View on GitHub
PyTorch implementation for PaLM: A Hybrid Parser and Language Model.
☆10Jan 7, 2020Updated 6 years ago
HazyResearch / based
View on GitHub
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆256Jun 6, 2025Updated last year
jemisjoky / umps_code
View on GitHub
u-MPS implementation and experimentation code used in the paper Tensor Networks for Probabilistic Sequence Modeling (https://arxiv.org/ab…
☆19Jul 2, 2020Updated 6 years ago
automl / unlocking_state_tracking
View on GitHub
Expanding linear RNN state-transition matrix eigenvalues to include negatives improves state-tracking tasks and language modeling without…
☆22Mar 15, 2025Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
google-deepmind / spectral_ssm
View on GitHub
☆35Apr 12, 2024Updated 2 years ago
luliyucoordinate / cute-flash-attention
View on GitHub
Implement Flash Attention using Cute.
☆108Dec 17, 2024Updated last year
timvieira / vocrf
View on GitHub
Variable-order CRFs with structure learning
☆17Aug 1, 2024Updated last year
AndPotap / einsum-search
View on GitHub
☆34Oct 4, 2024Updated last year
epfml / pam
View on GitHub
☆16Dec 9, 2023Updated 2 years ago
iesl / s-diora
View on GitHub
☆12Jan 29, 2021Updated 5 years ago
deep-spin / sparse-communication
View on GitHub
☆12Mar 7, 2022Updated 4 years ago