Kirill-Kravtsov / drophead-pytorch
An implementation of DropHead regularization for PyTorch transformers
☆19 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for drophead-pytorch
- Implementation of Mixout with PyTorch ☆74 · Updated last year
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in PyTorch ☆45 · Updated 3 years ago
- Implementation of Online Label Smoothing in PyTorch ☆94 · Updated 2 years ago
- My 6th🥇 place solution for the Kaggle Shopee competition. ☆27 · Updated 2 years ago
- ☆44 · Updated 3 years ago
- A PyTorch implementation of the paper "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆71 · Updated last year
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 · Updated 3 years ago
- Minimalistic TensorFlow2+ deep metric/similarity learning library with loss functions, miners, and utils as embedding projector. ☆37 · Updated last year
- Unsupervised Data Augmentation experiments in PyTorch ☆59 · Updated 5 years ago
- Implementation of Multistream Transformers in PyTorch ☆53 · Updated 3 years ago
- What are the best Systems? New Perspectives on NLP Benchmarking ☆13 · Updated last year
- Axial Positional Embedding for PyTorch ☆60 · Updated 3 years ago
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation" ☆10 · Updated last year
- Label smoothing PyTorch implementation ☆30 · Updated 4 years ago
- ☆28 · Updated 4 years ago
- A simple PyTorch implementation of Multi-Sample Dropout ☆56 · Updated 5 years ago
- Solution of Kaggle competition: Feedback Prize - Evaluating Student Writing ☆17 · Updated 2 years ago
- Code for EMNLP 2020 paper CoDIR ☆41 · Updated 2 years ago
- ☆63 · Updated 2 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in PyTorch ☆116 · Updated 3 years ago
- Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper ☆51 · Updated last year
- This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer … ☆55 · Updated 3 years ago
- An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain ☆33 · Updated 4 years ago
- A Python library for highly configurable transformers, easing model architecture search and experimentation. ☆49 · Updated 2 years ago
- AAAI 2021: Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels ☆23 · Updated 3 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in PyTorch ☆72 · Updated last year
- This is an example program illustrating BERT's masked language model. ☆28 · Updated 4 years ago
- Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING… ☆25 · Updated 3 years ago
- [FGVC9-CVPR 2022] The second place solution for 2nd eBay eProduct Visual Search Challenge. ☆26 · Updated 2 years ago
- ☆8 · Updated 4 years ago