bhavnicksm / vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al. (https://arxiv.org/abs/1706.03762)
☆12 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for vanilla-transformer-jax
- Perceiver (transformer variant) implemented in JAX and Flax ☆11 · Updated 3 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 · Updated 2 years ago
- ☆57 · Updated 2 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 2 years ago
- Unofficial but efficient implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆79 · Updated 9 months ago
- ☆15 · Updated 4 years ago
- Flax (JAX) implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation ☆12 · Updated 3 years ago
- Repo for the paper "Lie Point Symmetry Data Augmentation for Neural PDE Solvers" ☆48 · Updated last year
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 · Updated 5 months ago
- PyTorch-like dataloaders in JAX. ☆59 · Updated last month
- A port of the Mistral-7B model to JAX ☆30 · Updated 4 months ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in JAX (Equinox framework) ☆185 · Updated 2 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆47 · Updated 2 years ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆46 · Updated 10 months ago
- Fine-grained, dynamic control of neural network topology in JAX. ☆21 · Updated last year
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆80 · Updated 11 months ago
- ☆24 · Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆35 · Updated 4 months ago
- How to use the Flax Linen API to build a convolutional neural network model and train it for image classification (using TensorFlow Datas… ☆24 · Updated last year
- ☆29 · Updated 2 months ago
- Running JAX in PyTorch Lightning ☆82 · Updated 2 weeks ago
- A port of muP to JAX/Haiku ☆25 · Updated 2 years ago
- Flexibly track outputs and grad-outputs of torch.nn.Module. ☆13 · Updated last year
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 · Updated last year
- Implementation of deep implicit attention in PyTorch ☆63 · Updated 3 years ago
- Einsum-like high-level array sharding API for JAX ☆32 · Updated 4 months ago
- Official PyTorch implementation of the Vectorized Conditional Neural Field. ☆11 · Updated 3 months ago