dropreg / R-Drop
☆875Updated 10 months ago
Alternatives and similar repositories for R-Drop:
Users who are interested in R-Drop are comparing it to the libraries listed below
- RoFormer V1 & V2 pytorch☆491Updated 2 years ago
- UDA (Unsupervised Data Augmentation) implemented in PyTorch☆276Updated 5 years ago
- ☆488Updated last year
- Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer☆541Updated 3 years ago
- Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning" (ICLR 2022)☆523Updated 3 years ago
- Pytorch version of BERT-whitening☆308Updated 3 years ago
- Implementation of several losses for imbalanced data, such as focal_loss, dice_loss, DSC Loss, and GHM Loss☆259Updated 2 years ago
- Rotary Transformer☆922Updated 3 years ago
- The source code of FastBERT (ACL 2020)☆602Updated 3 years ago
- TensorFlow implementation of On the Sentence Embeddings from Pre-trained Language Models (EMNLP 2020)☆532Updated 3 years ago
- Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"☆360Updated last year
- Simple vector whitening to improve sentence embedding quality☆484Updated 3 years ago
- ⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).☆310Updated last year
- A novel method to tune language models. Code and datasets for the paper "GPT Understands, Too".☆929Updated 2 years ago
- [ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723☆727Updated 2 years ago
- The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`☆275Updated last year
- pytorch implementation for Patient Knowledge Distillation for BERT Model Compression☆201Updated 5 years ago
- A simple experiment with SimCSE on Chinese tasks☆602Updated last year
- [ICLR 2020] Lite Transformer with Long-Short Range Attention☆606Updated 8 months ago
- A Lite Bert For Self-Supervised Learning Language Representations☆715Updated 4 years ago
- Pytorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021☆300Updated last year
- DeLighT: Very Deep and Light-Weight Transformers☆468Updated 4 years ago
- A PyTorch-based knowledge distillation toolkit for natural language processing☆1,640Updated last year
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821☆3,523Updated 5 months ago
- Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis…☆250Updated 3 years ago
- Adversarial Training for Natural Language Understanding☆252Updated last year
- ☆251Updated 2 years ago
- A simple, easy-to-use version of TinyBert: a pre-trained language model built by knowledge distillation from Bert☆262Updated 4 years ago
- A PyTorch & Keras implementation and demo of Fastformer.☆187Updated 2 years ago
- Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".☆740Updated last year