lucidrains / minGRU-pytorch
Implementation of the proposed minGRU in Pytorch
☆302 · Updated 5 months ago
Alternatives and similar repositories for minGRU-pytorch
Users who are interested in minGRU-pytorch are comparing it to the libraries listed below.
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆119 · Updated 9 months ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆89 · Updated 2 months ago
- Pytorch implementation of the xLSTM model by Beck et al. (2024) ☆171 · Updated last year
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆104 · Updated this week
- Pytorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5) ☆77 · Updated last year
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆289 · Updated 2 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆185 · Updated 11 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆79 · Updated last year
- Implementation of GateLoop Transformer in Pytorch and Jax ☆90 · Updated last year
- ☆298 · Updated 7 months ago
- Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence). ☆121 · Updated 10 months ago
- ☆207 · Updated 8 months ago
- Implementation of Agent Attention in Pytorch