bhavnicksm / vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al. (https://arxiv.org/abs/1706.03762)
☆15 · Updated 3 years ago
Alternatives and similar repositories for vanilla-transformer-jax
Users that are interested in vanilla-transformer-jax are comparing it to the libraries listed below
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*