ZhuiyiTechnology / roformer
Rotary Transformer
☆974 · Updated 3 years ago
Alternatives and similar repositories for roformer
Users interested in roformer are comparing it to the libraries listed below.
- RoFormer V1 & V2 PyTorch ☆502 · Updated 3 years ago
- A fast MoE implementation for PyTorch ☆1,757 · Updated 5 months ago
- Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time" ☆366 · Updated last year
- SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants. ☆1,083 · Updated 6 months ago
- ☆879 · Updated last year
- PyTorch re-implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. (https://arxiv.org/abs/1701.06538) ☆1,134 · Updated last year
- Code for the ALiBi method for transformer language models (ICLR 2022) ☆536 · Updated last year
- A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models ☆776 · Updated last year
- Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch ☆707 · Updated this week
- A plug-and-play library for parameter-efficient tuning (Delta Tuning) ☆1,030 · Updated 9 months ago
- Tutel MoE: Optimized Mixture-of-Experts library, supports DeepSeek FP8/FP4 ☆851 · Updated last week
- Rectified Rotary Position Embeddings ☆374 · Updated last year
- real Transformer TeraFLOPS on various GPUs
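The repository at the top of this page implements Rotary Position Embeddings (RoPE), the idea several of the listed alternatives build on. For orientation while comparing these libraries, here is a minimal NumPy sketch of the core rotation; the function name `rotary_embed` and the default base of 10000 are illustrative assumptions, not the API of any repo listed here:

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Apply rotary position embeddings (RoPE) to x of shape (seq_len, dim).

    Each even/odd feature pair (x[m, 2i], x[m, 2i+1]) at position m is
    rotated by the angle m * base**(-2i/dim), so attention dot products
    between rotated queries and keys depend only on relative offsets.
    """
    seq_len, dim = x.shape
    assert dim % 2 == 0, "feature dimension must be even"
    # One rotation frequency per feature pair: theta_i = base^(-2i/dim)
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)    # (dim/2,)
    angles = np.outer(np.arange(seq_len), inv_freq)     # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                  # 2D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

The defining property, which the "Rectified Rotary Position Embeddings" entry above extends, is that the dot product between a rotated query at position m and a rotated key at position n is a function of m − n alone.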