Implementation of https://srush.github.io/annotated-s4
☆513 · Jun 20, 2025 · Updated 9 months ago
Alternatives and similar repositories for annotated-s4
Users that are interested in annotated-s4 are comparing it to the libraries listed below.
- Structured state space sequence models ☆2,869 · Jul 17, 2024 · Updated last year
- ☆317 · Jan 8, 2025 · Updated last year
- Annotated version of the Mamba paper ☆499 · Feb 27, 2024 · Updated 2 years ago
- Paper: Lexicon Learning for Few-Shot Neural Sequence Modeling ☆16 · Jan 8, 2022 · Updated 4 years ago
- ☆35 · Nov 22, 2024 · Updated last year
- Accelerated First Order Parallel Associative Scan ☆196 · Jan 7, 2026 · Updated 2 months ago
- Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE … ☆118 · Mar 16, 2024 · Updated 2 years ago
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4. ☆89 · Mar 1, 2024 · Updated 2 years ago
- Sequence Modeling with Structured State Spaces ☆67 · Aug 2, 2022 · Updated 3 years ago
- Following research on S4 in jax ☆16 · Jun 15, 2022 · Updated 3 years ago
- Recursive Bayesian Networks ☆11 · May 11, 2025 · Updated 10 months ago
- Reading list for research topics in state-space models ☆354 · Jun 11, 2025 · Updated 9 months ago
- What would you do with 1000 H100s... ☆1,161 · Jan 10, 2024 · Updated 2 years ago
- Sequence modeling with Mega. ☆303 · Jan 28, 2023 · Updated 3 years ago
- ☆164 · Jan 24, 2023 · Updated 3 years ago
- ☆29 · Nov 30, 2021 · Updated 4 years ago
- Simple, minimal implementation of the Mamba SSM in one file of PyTorch. ☆2,929 · Mar 8, 2024 · Updated 2 years ago
- maximal update parametrization (µP) ☆1,690 · Jul 17, 2024 · Updated last year
- Convolutions for Sequence Modeling ☆912 · Jun 13, 2024 · Updated last year
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆344 · Dec 28, 2024 · Updated last year
- Language Modeling with the H3 State Space Model ☆522 · Sep 29, 2023 · Updated 2 years ago
- ☆51 · Jan 28, 2024 · Updated 2 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 · Mar 18, 2023 · Updated 3 years ago
- Mamba SSM architecture ☆17,524 · Updated this week
- ☆167 · Jul 5, 2023 · Updated 2 years ago
- Train very large language models in Jax. ☆210 · Oct 21, 2023 · Updated 2 years ago
- ☆39 · Apr 5, 2024 · Updated last year
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆250 · Jun 6, 2025 · Updated 9 months ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆93 · Jan 25, 2024 · Updated 2 years ago
- Non official implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023) ☆62 · Sep 3, 2025 · Updated 6 months ago
- Long Range Arena for Benchmarking Efficient Transformers ☆786 · Dec 16, 2023 · Updated 2 years ago
- Blog post ☆17 · Feb 16, 2024 · Updated 2 years ago
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,124 · Apr 20, 2022 · Updated 3 years ago
- Silly twitter torch implementations. ☆46 · Oct 14, 2022 · Updated 3 years ago
- Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/ ☆2,821 · Mar 9, 2026 · Updated last week
- Parallel Associative Scan for Language Models ☆18 · Jan 8, 2024 · Updated 2 years ago
- A simple and efficient Mamba implementation in pure PyTorch and MLX. ☆1,442 · Jan 26, 2026 · Updated last month
- ☆10 · Jun 27, 2024 · Updated last year
- Some preliminary explorations of Mamba's context scaling. ☆218 · Feb 8, 2024 · Updated 2 years ago