Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training"; it improves existing models such as BERT.
☆252 · Updated Nov 8, 2021
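TUPE's core idea is to untie content and position in self-attention: instead of adding positional embeddings into the token embeddings, it computes a content-to-content term and a position-to-position term with separate projections and sums the two score matrices. A minimal numpy sketch of that score computation follows; the names (`Wq`, `Uq`, etc.) are illustrative, and the full model additionally works per attention head and treats the [CLS] token specially, which this sketch omits.

```python
import numpy as np

def tupe_attention_scores(x, pos, Wq, Wk, Uq, Uk):
    """Untied positional attention scores (TUPE-style sketch).

    x   : (n, d) token (content) representations
    pos : (n, d) absolute positional embeddings
    Content and position each get their own query/key projections,
    so their correlations are computed separately and then summed.
    """
    d = Wq.shape[1]
    # content-to-content attention term
    content = (x @ Wq) @ (x @ Wk).T
    # position-to-position attention term with untied projections
    position = (pos @ Uq) @ (pos @ Uk).T
    # the paper scales the sum by sqrt(2d) to keep the variance
    # comparable to standard scaled dot-product attention
    return (content + position) / np.sqrt(2 * d)

# toy usage with random weights
rng = np.random.default_rng(0)
n, d = 4, 8
x = rng.standard_normal((n, d))
pos = rng.standard_normal((n, d))
Wq, Wk, Uq, Uk = [rng.standard_normal((d, d)) for _ in range(4)]
scores = tupe_attention_scores(x, pos, Wq, Wk, Uq, Uk)
print(scores.shape)  # (4, 4): one logit per query/key pair
```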
Alternatives and similar repositories for TUPE
Users interested in TUPE are comparing it to the repositories listed below.
- ☆221 · Updated Jun 8, 2020
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith, and Mike Lewis. ☆147 · Updated Jul 26, 2021
- Source code for "Efficient Training of BERT by Progressively Stacking". ☆112 · Updated Jul 3, 2019
- ☆99 · Updated Jul 7, 2020
- An efficient implementation of popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆433 · Updated Aug 17, 2022
- DeLighT: Very Deep and Light-Weight Transformers. ☆469 · Updated Oct 16, 2020
- Official PyTorch implementation of the Length-Adaptive Transformer (ACL 2021). ☆102 · Updated Nov 2, 2020
- MPNet: Masked and Permuted Pre-training for Language Understanding. https://arxiv.org/pdf/2004.09297.pdf ☆297 · Updated Sep 11, 2021
- ☆255 · Updated Oct 4, 2022
- ☆880 · Updated May 24, 2024
- The implementation of "Neural Machine Translation without Embeddings" (NAACL 2021). ☆33 · Updated Jun 9, 2021
- Language model with phrase induction. ☆14 · Updated Jun 13, 2019
- ☆20 · Updated Feb 26, 2021
- The implementation of DeBERTa. ☆2,202 · Updated Sep 29, 2023
- Transformer training code for sequential tasks. ☆609 · Updated Sep 14, 2021
- Cascaded Text Generation with Markov Transformers. ☆130 · Updated Mar 20, 2023
- ☆15 · Updated May 26, 2021
- Code for using and evaluating SpanBERT. ☆906 · Updated Jul 25, 2023
- Tracking the progress in non-autoregressive generation (translation, transcription, etc.). ☆302 · Updated Mar 15, 2023
- Code and pre-trained models for the paper "Segatron: Segment-aware Transformer for Language Modeling and Understanding". ☆18 · Updated Oct 25, 2022
- Single Headed Attention RNN: "Stop thinking with your head". ☆1,181 · Updated Nov 27, 2021
- Understanding the Difficulty of Training Transformers. ☆332 · Updated May 31, 2022
- ☆84 · Updated Nov 14, 2019
- Solution for KDD Cup 2021. ☆11 · Updated Jun 16, 2021
- The implementation of the paper "Harvesting and Refining Question-Answer Pairs for Unsupervised QA". ☆33 · Updated Nov 25, 2020
- Code associated with the "Don't Stop Pretraining" ACL 2020 paper. ☆540 · Updated Nov 15, 2021
- Custom CUDA kernel for {2, 3}D relative attention with a PyTorch wrapper. ☆43 · Updated May 5, 2020
- The codebase for "Group-wise Contrastive Learning for Neural Dialogue Generation" (Cai et al., Findings of EMNLP 2020). ☆55 · Updated Feb 24, 2021
- Code for the ACL 2020 paper "Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation". ☆39 · Updated Jun 24, 2020
- Code for the SIGIR 2021 short paper "Lighter and Better: Low-Rank Decomposed Self-Attention Networks for Next-Item Recommendation". ☆15 · Updated May 5, 2021
- Official repository of the R2-D2 pipeline. ☆21 · Updated Nov 16, 2021
- [NeurIPS 2022] Your Transformer May Not Be as Powerful as You Expect (official implementation). ☆34 · Updated Aug 6, 2023
- Study of pre-trained positional embeddings. ☆16 · Updated Nov 6, 2020
- Implementation of the paper "Parallelizable Stack Long Short-Term Memory". ☆12 · Updated Apr 8, 2019
- Code for the RecAdam paper "Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting". ☆119 · Updated Nov 10, 2020
- Implementation of "Imputer: Sequence Modelling via Imputation and Dynamic Programming" in PyTorch. ☆58 · Updated May 3, 2020
- Shared repository for open-sourced projects from the Google AI Language team. ☆1,760 · Updated Mar 17, 2026
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings. https://arxiv.org/abs/2104.08821 ☆3,643 · Updated Oct 16, 2024
- A Python library for highly configurable transformers, easing model architecture search and experimentation. ☆48 · Updated Nov 30, 2021