microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆881 · Updated last month
Alternatives and similar repositories for Samba
Users interested in Samba are comparing it to the libraries listed below.
- Code for the BLT research paper — ☆1,686 · Updated last month
- Minimalistic large language model 3D-parallelism training — ☆1,926 · Updated last week
- Muon: An optimizer for hidden layers in neural networks — ☆897 · Updated last week
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch — ☆519 · Updated last month
- [ICLR 2025 Spotlight🔥] Official implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters — ☆562 · Updated 4 months ago
- Large Context Attention — ☆716 · Updated 4 months ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens with minimal hardware — ☆731 · Updated 8 months ago
- ☆178 · Updated 6 months ago
- A repository for research on medium-sized language models — ☆498 · Updated 2 weeks ago
- Recipes to scale inference-time compute of open models — ☆1,095 · Updated last month
- Helpful tools and examples for working with flex-attention — ☆831 · Updated last week
- Pretraining code for a large-scale depth-recurrent language model