lukemelas / simple-bert
A simple PyTorch implementation of BERT, complete with pretrained models and training scripts.
☆43 Updated 6 years ago
Alternatives and similar repositories for simple-bert
Users who are interested in simple-bert are comparing it to the libraries listed below.
- A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering ☆43 Updated 4 years ago
- Implementation of Kronecker Attention in Pytorch ☆19 Updated 4 years ago
- An open source implementation of CLIP. ☆32 Updated 2 years ago
- Implementation of the Remixer Block from the Remixer paper, in Pytorch ☆36 Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆50 Updated 3 years ago
- MTAdam: Automatic Balancing of Multiple Training Loss Terms ☆36 Updated 4 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆49 Updated 3 years ago
- ☆24 Updated 4 years ago
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆59 Updated 4 years ago
- A non-JIT version implementation / replication of CLIP of OpenAI in pytorch ☆34 Updated 4 years ago
- ☆24 Updated 3 years ago
- A deep learning library based on Pytorch focussed on low resource language research and robustness ☆70 Updated 3 years ago
- Tensorflow implementation of cosine normalization ☆6 Updated 5 years ago
- [NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks ☆60 Updated 2 years ago
- ☆20 Updated 3 years ago
- [ICLR 2021] Beyond Categorical Label Representations for Image Classification ☆25 Updated 3 years ago
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch ☆58 Updated 4 years ago
- ☆21 Updated 4 years ago
- Code for reversible recurrent neural networks ☆39 Updated 6 years ago
- ☆24 Updated last year
- ☆41 Updated 2 years ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 Updated 2 years ago
- A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short. ☆48 Updated 11 months ago
- Transfer Learning via Unsupervised Task Discovery for Visual Question Answering ☆31 Updated 6 years ago
- (ICML 2021) Implementation for S2SD - Simultaneous Similarity-based Self-Distillation for Deep Metric Learning. Paper Link: https://arxiv… ☆43 Updated 4 years ago
- A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor… ☆47 Updated 2 years ago
- PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World [ACL 2021] ☆56 Updated 3 years ago
- Implementation of Multistream Transformers in Pytorch ☆54 Updated 3 years ago
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 Updated 4 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆119 Updated 3 years ago