assafbk/DeciMamba

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/assafbk/DeciMamba)

assafbk / DeciMamba

DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)

☆32

Alternatives and similar repositories for DeciMamba

Users that are interested in DeciMamba are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

proger / nanokitchen
View on GitHub
Parallel Associative Scan for Language Models
☆18Jan 8, 2024Updated 2 years ago
sjelassi / transformers_ssm_copy
View on GitHub
☆40Feb 26, 2024Updated 2 years ago
jzhang38 / LongMamba
View on GitHub
Some preliminary explorations of Mamba's context scaling.
☆221Feb 8, 2024Updated 2 years ago
maximzubkov / fft-scan
View on GitHub
Efficient PScan implementation in PyTorch
☆17Jan 2, 2024Updated 2 years ago
bdusell / stack-attention
View on GitHub
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18Mar 15, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
berlino / seq_icl
View on GitHub
☆54May 20, 2024Updated 2 years ago
nanowell / Q-Sparse-LLM
View on GitHub
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆37Aug 14, 2024Updated last year
assafbk / OPRM
View on GitHub
Overflow Prevention Enhances Long-Context Recurrent LLMs (COLM 2025)
☆18Jul 8, 2025Updated last year
Noahs-ARK / PaLM
View on GitHub
PyTorch implementation for PaLM: A Hybrid Parser and Language Model.
☆10Jan 7, 2020Updated 6 years ago
DoctorKey / Practise
View on GitHub
[CVPR2023] Practical Network Acceleration with Tiny Sets
☆13Jul 28, 2023Updated 2 years ago
recursal / GoldFinch-paper
View on GitHub
GoldFinch and other hybrid transformer components
☆46Jul 20, 2024Updated 2 years ago
tyyyang / MemDTC
View on GitHub
Code for "Visual Tracking via Dynamic Memory Networks"
☆12Jul 18, 2019Updated 7 years ago
ethansmith2000 / TransformerExperiments
View on GitHub
☆19Dec 4, 2025Updated 7 months ago
AmeenAli / HiddenMambaAttn
View on GitHub
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆234Oct 16, 2025Updated 9 months ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
XuezheMax / gecko-llm
View on GitHub
Gecko Architecture
☆16Jan 13, 2026Updated 6 months ago
srush / tangent
View on GitHub
Source-to-Source Debuggable Derivatives in Pure Python
☆15Jan 23, 2024Updated 2 years ago
Hannibal046 / GridTST
View on GitHub
Source code for Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers
☆19May 29, 2024Updated 2 years ago
Zyphra / Zamba2
View on GitHub
PyTorch implementation of models from the Zamba2 series.
☆193Jan 23, 2025Updated last year
jxiw / MambaInLlama
View on GitHub
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
☆243Oct 14, 2025Updated 9 months ago
rycolab / aflt-f2023
View on GitHub
Advanced Formal Language Theory (263-5352-00L; Frühjahr 2023)
☆10Feb 21, 2023Updated 3 years ago
Cranial-XIX / longhorn
View on GitHub
Official PyTorch Implementation of the Longhorn Deep State Space Model
☆57Dec 4, 2024Updated last year
emalach / LinearLM
View on GitHub
Code for the paper: https://arxiv.org/pdf/2309.06979.pdf
☆21Jul 29, 2024Updated last year
phiphiphi31 / DualTFR
View on GitHub
☆18Jun 26, 2023Updated 3 years ago
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
sustcsonglin / mamba-triton
View on GitHub
☆52Jan 28, 2024Updated 2 years ago
Leooyii / LCEG
View on GitHub
[COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs
☆65Mar 9, 2026Updated 4 months ago
HazyResearch / prefix-linear-attention
View on GitHub
☆62Jul 9, 2024Updated 2 years ago
yikangshen / megablocks
View on GitHub
☆20May 30, 2024Updated 2 years ago
nick7nlp / Counting-Stars
View on GitHub
Counting-Stars (★)
☆83Nov 24, 2025Updated 7 months ago
zhangjiong724 / spectral-RNN
View on GitHub
STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION
☆16Jun 5, 2018Updated 8 years ago
WailordHe / DenseSSM
View on GitHub
A repository for DenseSSMs
☆90Apr 11, 2024Updated 2 years ago
albertfgu / awesome-ssm-ml
View on GitHub
☆14May 30, 2024Updated 2 years ago
Raphaaal / fieldy
View on GitHub
Fine-grained attention in hierarchical transformers for tabular time-series.
☆12Dec 24, 2024Updated last year
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
stephenkyang / mean-reversion-pairs-trading
View on GitHub
manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices
☆11Jan 12, 2021Updated 5 years ago
NUS-HPC-AI-Lab / R-MeeTo
View on GitHub
Give us minutes, we give back a faster Mamba. The official implementation of "Faster Vision Mamba is Rebuilt in Minutes via Merged Token …
☆39Dec 18, 2024Updated last year
TaiMingLu / know-dont-tell
View on GitHub
☆19Oct 14, 2024Updated last year
BBuf / flash-rwkv
View on GitHub
☆32May 26, 2024Updated 2 years ago
henryzhongsc / longctx_bench
View on GitHub
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024
☆89Feb 27, 2025Updated last year
acosharma / elita-transformer
View on GitHub
Official Repository for Efficient Linear-Time Attention Transformers.
☆18Jun 2, 2024Updated 2 years ago
eamartin / parallelizing_linear_rnns
View on GitHub
☆45Apr 30, 2018Updated 8 years ago