Jamie-Stirling / RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
☆1,205 · Updated 2 years ago
Alternatives and similar repositories for RetNet
Users who are interested in RetNet are comparing it to the libraries listed below.
- Meta-Transformer for Unified Multimodal Learning ☆1,634 · Updated last year
- Foundation Architecture for (M)LLMs ☆3,119 · Updated last year
- Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent,… ☆226 · Updated last year
- Simple, minimal implementation of the Mamba SSM in one file of PyTorch. ☆2,878 · Updated last year
- A simple and efficient Mamba implementation in pure PyTorch and MLX. ☆1,353 · Updated 10 months ago
- Structured state space sequence models ☆2,760 · Updated last year
- The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training” ☆977 · Updated last year
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch