WailordHe/DenseSSM

A repository for DenseSSMs

☆90

Alternatives and similar repositories for DenseSSM

Users interested in DenseSSM are comparing it to the repositories listed below.

BlinkDL / LinearAttentionArena
Here we will test various linear attention designs.
☆62 · Apr 25, 2024 · Updated 2 years ago
OpenNLPLab / HGRN
[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…
☆68 · Apr 24, 2024 · Updated 2 years ago
berlino / gated_linear_attention
☆107 · Mar 9, 2024 · Updated 2 years ago
siyuanseever / llama2Rnn.c
☆13 · Apr 15, 2024 · Updated 2 years ago
sustcsonglin / mamba-triton
☆52 · Jan 28, 2024 · Updated 2 years ago
kazuki-irie / kv-memory-brain
Official Code Repository for the paper "Key-value memory in the brain"
☆32 · Feb 25, 2025 · Updated last year
tobiaskatsch / GatedLinearRNN
☆30 · Feb 27, 2024 · Updated 2 years ago
00ffcc / chunkRWKV6
Continuous batching and parallel acceleration for RWKV6
☆23 · Jun 28, 2024 · Updated 2 years ago
AlirezaMorsali / MLP-Attention
☆17 · Dec 19, 2024 · Updated last year
YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆126 · Jan 14, 2025 · Updated last year
ggjy / vision_weak_to_strong
☆38 · Feb 8, 2024 · Updated 2 years ago
lxxue / prefix_sum
A PyTorch wrapper of parallel exclusive scan in CUDA
☆12 · May 25, 2023 · Updated 3 years ago
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 · Apr 17, 2024 · Updated 2 years ago
glassroom / heinsen_attention
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
☆25 · Jun 6, 2024 · Updated 2 years ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33 · May 25, 2024 · Updated 2 years ago
AmeenAli / HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆234 · Oct 16, 2025 · Updated 9 months ago
proger / hippogriff
Griffin MQA + Hawk Linear RNN Hybrid
☆89 · Apr 13, 2026 · Updated 3 months ago
krafton-ai / mambaformer-icl
MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248
☆63 · Jun 18, 2024 · Updated 2 years ago
ZhengYu518 / VL-Mamba
Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"
☆86 · Mar 21, 2024 · Updated 2 years ago
assafbk / DeciMamba
DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)
☆32 · Apr 9, 2025 · Updated last year
Zyphra / BlackMamba
Code repository for Black Mamba
☆265 · Feb 8, 2024 · Updated 2 years ago
samblouir / birdie
☆15 · Jun 8, 2026 · Updated last month
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆58 · Aug 20, 2024 · Updated last year
facebookresearch / spartan
Spartan is an algorithm for training sparse neural network models. This repository accompanies the paper "Spartan Differentiable Sparsity…
☆26 · Oct 31, 2022 · Updated 3 years ago
menik1126 / Swing-Bench
[ICLR2026🔥Oral] SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
☆15 · Feb 26, 2026 · Updated 5 months ago
kyegomez / MambaByte
Implementation of MambaByte in "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
☆128 · Jul 20, 2026 · Updated last week
aiha-lab / TSLD
[NeurIPS 2023] Token-Scaled Logit Distillation for Ternary Weight Generative Language Models
☆18 · Dec 6, 2023 · Updated 2 years ago
fsssosei / similarity_index_of_label_graph
A package for calculating the similarity index of label graph pairs.
☆13 · Nov 4, 2020 · Updated 5 years ago
zhaoxlpku / SubgoalXL
☆26 · Aug 23, 2024 · Updated last year
GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆27 · Oct 5, 2024 · Updated last year
radarFudan / Awesome-state-space-models
Collection of papers on state-space models
☆620 · Nov 4, 2025 · Updated 8 months ago
renll / SeqBoat
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling
☆40 · Dec 2, 2023 · Updated 2 years ago
automl / DeltaProduct
DeltaProduct is a new linear recurrent neural network architecture that uses products of generalized Householder matrices as state-transi…
☆15 · Oct 13, 2025 · Updated 9 months ago
YuchuanTian / DiJiang
[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at…
☆103 · Jun 14, 2024 · Updated 2 years ago
NJUDeepEngine / CAEF
Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"
☆11 · Oct 11, 2024 · Updated last year
yikangshen / megablocks
☆20 · May 30, 2024 · Updated 2 years ago
syncdoth / RetNet
Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent,…
☆227 · Mar 12, 2024 · Updated 2 years ago
google-deepmind / spectral_ssm
☆35 · Apr 12, 2024 · Updated 2 years ago
microsoft / EfficientLongSequenceModeling
☆54 · Jan 19, 2023 · Updated 3 years ago