Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models such as BERT.
☆253 · Nov 8, 2021 · Updated 4 years ago
Alternatives and similar repositories for TUPE
Users who are interested in TUPE are comparing it to the repositories listed below.
- ☆221 · Jun 8, 2020 · Updated 5 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis. ☆147 · Jul 26, 2021 · Updated 4 years ago
- Source code for "Efficient Training of BERT by Progressively Stacking" ☆113 · Jul 3, 2019 · Updated 6 years ago
- ☆99 · Jul 7, 2020 · Updated 5 years ago
- DeLighT: Very Deep and Light-Weight Transformers ☆469 · Oct 16, 2020 · Updated 5 years ago
- ☆10 · Aug 15, 2022 · Updated 3 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) ☆102 · Nov 2, 2020 · Updated 5 years ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆433 · Aug 17, 2022 · Updated 3 years ago
- ☆254 · Oct 4, 2022 · Updated 3 years ago
- MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf ☆298 · Sep 11, 2021 · Updated 4 years ago
- ☆880 · May 24, 2024 · Updated last year
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 3 years ago
- Single Headed Attention RNN - "Stop thinking with your head" ☆1,180 · Nov 27, 2021 · Updated 4 years ago
- Transformer training code for sequential tasks ☆609 · Sep 14, 2021 · Updated 4 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 · Jun 9, 2021 · Updated 4 years ago
- The codebase for "Group-wise Contrastive Learning for Neural Dialogue Generation" (Cai et al., Findings of EMNLP 2020) ☆55 · Feb 24, 2021 · Updated 5 years ago
- Code for using and evaluating SpanBERT. ☆904 · Jul 25, 2023 · Updated 2 years ago
- Implementation of Imputer: Sequence Modelling via Imputation and Dynamic Programming in PyTorch ☆58 · May 3, 2020 · Updated 5 years ago
- Cascaded Text Generation with Markov Transformers ☆130 · Mar 20, 2023 · Updated 2 years ago
- The implementation of DeBERTa ☆2,197 · Sep 29, 2023 · Updated 2 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆48 · Nov 30, 2021 · Updated 4 years ago
- Code associated with the Don't Stop Pretraining ACL 2020 paper ☆539 · Nov 15, 2021 · Updated 4 years ago
- Tracking the progress in non-autoregressive generation (translation, transcription, etc.) ☆302 · Mar 15, 2023 · Updated 2 years ago
- [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o… ☆606 · Jun 15, 2022 · Updated 3 years ago
- [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining ☆118 · Jul 25, 2023 · Updated 2 years ago
- ☆15 · May 26, 2021 · Updated 4 years ago
- LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021) ☆18 · May 10, 2023 · Updated 2 years ago
- ☆84 · Nov 14, 2019 · Updated 6 years ago
- Code for the RecAdam paper: Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting. ☆120 · Nov 10, 2020 · Updated 5 years ago
- Longformer: The Long-Document Transformer ☆2,188 · Feb 8, 2023 · Updated 3 years ago
- ☆20 · Feb 26, 2021 · Updated 5 years ago
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821 ☆3,641 · Oct 16, 2024 · Updated last year
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆2,371 · Mar 23, 2024 · Updated last year
- ☆41 · Feb 12, 2019 · Updated 7 years ago
- Shared repository for open-sourced projects from the Google AI Language team. ☆1,752 · Feb 20, 2026 · Updated last week
- Multi-Task Deep Neural Networks for Natural Language Understanding ☆2,258 · Mar 7, 2024 · Updated last year
- MASS: Masked Sequence to Sequence Pre-training for Language Generation ☆1,123 · Nov 28, 2022 · Updated 3 years ago
- custom cuda kernel for {2, 3}d relative attention with pytorch wrapper ☆43 · May 5, 2020 · Updated 5 years ago
- [ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering ☆44 · Jun 18, 2022 · Updated 3 years ago