☆27Jul 9, 2024Updated last year
Alternatives and similar repositories for state-space-models
Users that are interested in state-space-models are comparing it to the libraries listed below.
- gzip Predicts Data-dependent Scaling Laws☆34May 28, 2024Updated last year
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"☆14May 26, 2025Updated 10 months ago
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258)☆45Jan 6, 2026Updated 2 months ago
- Implementation of Spectral State Space Models☆16Feb 23, 2024Updated 2 years ago
- Grokking on modular arithmetic in less than 150 epochs in MLX☆16Oct 24, 2024Updated last year
- Focused on fast experimentation and simplicity☆80Dec 24, 2024Updated last year
- [PNAS'18] Recurrent computations for visual pattern completion: Classification of occluded images in humans and recurrent neural networks☆19Sep 11, 2018Updated 7 years ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- ☆306Jul 15, 2024Updated last year
- A PyTorch implementation of Knowledge Graph Embedding by Normalizing Flows.☆10Nov 22, 2022Updated 3 years ago
- ☆18Mar 18, 2024Updated 2 years ago
- ☆124Feb 21, 2025Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32May 25, 2024Updated last year
- Linear Attention Sequence Parallelism (LASP)☆88Jun 4, 2024Updated last year
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated 10 months ago
- A star for organising blocks and playing with transformers.☆23Apr 28, 2024Updated last year
- ☆13Apr 25, 2025Updated 11 months ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 8 months ago
- manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices☆12Jan 12, 2021Updated 5 years ago
- A repo to do interpretability of pre-trained acoustic models☆15Oct 15, 2023Updated 2 years ago
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications☆52Oct 30, 2025Updated 4 months ago
- ☆14Mar 31, 2024Updated last year
- Official implementation of Data Contamination Can Cross Language Barriers☆12Sep 11, 2024Updated last year
- A platform aimed at creating websites that perform self-optimization☆12May 4, 2024Updated last year
- End to End Machine Learning Pipeline with scikit learn☆12Mar 10, 2021Updated 5 years ago
- ☆45Nov 1, 2025Updated 4 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Apr 22, 2025Updated 11 months ago
- coloring terminal text with intensities (used for plotting probability, entropy with tokens)☆12Oct 11, 2024Updated last year
- A few models converted from caffe to CoreMLs format.☆15Jun 6, 2017Updated 8 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆93Jan 25, 2024Updated 2 years ago
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts☆17Mar 11, 2025Updated last year
- Gradient descent is cool and all, but what if we could delete it?☆106Aug 20, 2025Updated 7 months ago
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- High-performance tokenized language data-loader for Python C++ extension☆14Jul 22, 2024Updated last year
- Combining SOAP and MUON☆19Feb 11, 2025Updated last year
- Champion at Brainhack TIL 2023: Team 10000SGDMRT☆18May 29, 2024Updated last year
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip.☆44Sep 6, 2023Updated 2 years ago
- Ultimate NLP Toolkit for GPUs: RAPIDS-AI, PyTorch, NeMo, Tensorboard, TensorRT, CUDA 10.1☆10Mar 19, 2020Updated 6 years ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats.☆31May 29, 2023Updated 2 years ago