krafton-ai / mambaformer-icl
MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248
☆54Updated 11 months ago
Alternatives and similar repositories for mambaformer-icl
Users that are interested in mambaformer-icl are comparing it to the libraries listed below
- ☆31Updated last year
- ☆23Updated 8 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆54Updated 9 months ago
- Stick-breaking attention☆56Updated 2 months ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆28Updated last month
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆38Updated 7 months ago
- [ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation☆42Updated 3 months ago
- ☆54Updated 10 months ago
- [ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate"☆107Updated 3 weeks ago
- Here we will test various linear attention designs.