lucaslingle/transformer_vq

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/lucaslingle/transformer_vq)

lucaslingle / transformer_vq

Official implementation of 'Transformer-VQ: Linear-Time Transformers via Vector Quantization' (ICLR 2024)

☆199

Alternatives and similar repositories for transformer_vq

Users that are interested in transformer_vq are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

sustcsonglin / disco-pointer
View on GitHub
Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span …
☆14Aug 25, 2023Updated 2 years ago
jungokasai / T2R
View on GitHub
☆14Nov 20, 2022Updated 3 years ago
berlino / seq_icl
View on GitHub
☆54May 20, 2024Updated 2 years ago
Noahs-ARK / PaLM
View on GitHub
PyTorch implementation for PaLM: A Hybrid Parser and Language Model.
☆10Jan 7, 2020Updated 6 years ago
berlino / gated_linear_attention
View on GitHub
☆107Mar 9, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Doraemonzzz / xmixers
View on GitHub
Xmixers: A collection of SOTA efficient token/channel mixers
☆29Sep 4, 2025Updated 10 months ago
youngsheen / SimVQ
View on GitHub
[ICCV 2025] SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
☆328Dec 29, 2024Updated last year
jenni-ai / T2FW
View on GitHub
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20Oct 9, 2022Updated 3 years ago
dangxingyu / rnn-icrag
View on GitHub
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27Apr 17, 2024Updated 2 years ago
emorynlp / seq2seq-corenlp
View on GitHub
☆13Feb 7, 2023Updated 3 years ago
BlinkDL / LinearAttentionArena
View on GitHub
Here we will test various linear attention designs.
☆62Apr 25, 2024Updated 2 years ago
tml-epfl / why-weight-decay
View on GitHub
Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]
☆73Sep 25, 2024Updated last year
VITA-Group / Data-Efficient-Scaling
View on GitHub
[ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang
☆14Jan 4, 2024Updated 2 years ago
OpenNLPLab / LASP
View on GitHub
Linear Attention Sequence Parallelism (LASP)
☆87Jun 4, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
amirzandieh / HyperAttention
View on GitHub
Triton Implementation of HyperAttention Algorithm
☆48Dec 11, 2023Updated 2 years ago
rycolab / aflt-f2023
View on GitHub
Advanced Formal Language Theory (263-5352-00L; Frühjahr 2023)
☆10Feb 21, 2023Updated 3 years ago
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
renll / SeqBoat
View on GitHub
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling
☆40Dec 2, 2023Updated 2 years ago
bdusell / stack-attention
View on GitHub
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18Mar 15, 2024Updated 2 years ago
lucidrains / FLASH-pytorch
View on GitHub
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"
☆372Sep 26, 2023Updated 2 years ago
HazyResearch / based
View on GitHub
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆256Jun 6, 2025Updated last year
jzhang38 / LongMamba
View on GitHub
Some preliminary explorations of Mamba's context scaling.
☆221Feb 8, 2024Updated 2 years ago
lucidrains / gateloop-transformer
View on GitHub
Implementation of GateLoop Transformer in Pytorch and Jax
☆93Jun 18, 2024Updated 2 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
zsLin177 / SRL-as-GP
View on GitHub
☆18Mar 10, 2023Updated 3 years ago
iankur / vqllm
View on GitHub
Residual vector quantization for KV cache compression in large language model
☆12Oct 22, 2024Updated last year
whyNLP / Probabilistic-Transformer
View on GitHub
A probabilitic model for contextual word representation. Accepted to ACL2023 Findings.
☆26Oct 22, 2023Updated 2 years ago
eamartin / parallelizing_linear_rnns
View on GitHub
☆45Apr 30, 2018Updated 8 years ago
radarFudan / mamba-minimal-jax
View on GitHub
☆36Nov 22, 2024Updated last year
lucidrains / taylor-series-linear-attention
View on GitHub
Explorations into the recently proposed Taylor Series Linear Attention
☆101Aug 18, 2024Updated last year
sustcsonglin / mamba-triton
View on GitHub
☆52Jan 28, 2024Updated 2 years ago
FranxYao / Long-Context-Data-Engineering
View on GitHub
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆502Mar 19, 2024Updated 2 years ago
irhum / hyena
View on GitHub
JAX/Flax implementation of the Hyena Hierarchy
☆35Apr 27, 2023Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
glassroom / heinsen_attention
View on GitHub
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
☆25Jun 6, 2024Updated 2 years ago
radarFudan / Curse-of-memory
View on GitHub
Curse-of-memory phenomenon of RNNs in sequence modelling
☆19May 8, 2025Updated last year
acosharma / elita-transformer
View on GitHub
Official Repository for Efficient Linear-Time Attention Transformers.
☆18Jun 2, 2024Updated 2 years ago
fla-org / flash-bidirectional-linear-attention
View on GitHub
Triton implement of bi-directional (non-causal) linear attention
☆78Mar 1, 2026Updated 4 months ago
bojone / FSQ
View on GitHub
Keras implement of Finite Scalar Quantization
☆87Oct 31, 2023Updated 2 years ago
yikangshen / megablocks
View on GitHub
☆20May 30, 2024Updated 2 years ago
robert-lieck / RBN
View on GitHub
Recursive Bayesian Networks
☆11May 11, 2025Updated last year