facebookresearch / SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in combination with self-training and knowledge-distillation, or for retrieving paraphrases.
☆362 Updated 3 years ago
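The retrieval idea in the description (finding sentences in a bank that are similar to a query) can be illustrated with a minimal sketch. This is not SentAugment's own code: it assumes a toy bag-of-words embedding and brute-force cosine similarity, and the names `embed`, `cosine`, and `retrieve` are hypothetical. A real system would typically use a learned sentence encoder and a nearest-neighbor index instead.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy embedding (assumption): lowercase bag-of-words counts.

    Stands in for a real sentence encoder.
    """
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, bank, k=2):
    """Return the k sentences in `bank` most similar to `query`."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), s) for s in bank]
    # Highest similarity first; ties keep their original bank order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


# Hypothetical sentence bank, for illustration only.
bank = [
    "the movie was great fun",
    "stock prices fell sharply today",
    "this film was great",
    "rain is expected tomorrow",
]
neighbors = retrieve("the film was great", bank, k=2)
print(neighbors)
```

In a self-training or knowledge-distillation setup, the retrieved neighbors would then be labeled by a teacher model and added to the training data; that step is omitted here.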
Alternatives and similar repositories for SentAugment
Users who are interested in SentAugment are comparing it to the repositories listed below.
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!) ☆330 Updated last year
- Interpretable Evaluation for (Almost) All NLP Tasks ☆195 Updated 2 years ago
- Unsupervised Question answering via Cloze Translation ☆219 Updated 3 years ago
- New dataset ☆306 Updated 3 years ago
- The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to o… ☆380 Updated 2 years ago
- ☆345 Updated 4 years ago
- Enhancing the BERT training with Semi-supervised Generative Adversarial Networks ☆228 Updated 2 years ago
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines ☆137 Updated last year
- Awesome Neural Adaptation in Natural Language Processing. A curated list. https://arxiv.org/abs/2006.00632 ☆266 Updated 4 years ago
- Python code for various NLP metrics ☆168 Updated 5 years ago
- Minimalist implementation of a BERT Sentence Classifier with PyTorch Lightning, Transformers and PyTorch-NLP. ☆219 Updated 2 years ago
- [NAACL 2021] This is the code for our paper `Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self… ☆204 Updated 3 years ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆283 Updated 2 years ago
- Repository for the paper "Optimal Subarchitecture Extraction for BERT" ☆472 Updated 3 years ago
- Neural Text Generation with Unlikelihood Training ☆309 Updated 3 years ago
- IEEE/ACM TASLP 2020: SBERT-WK: A Sentence Embedding Method By Dissecting BERT-based Word Models ☆180 Updated 4 years ago
- With the aim of building next generation virtual assistants that can handle multimodal inputs and perform multimodal actions, we introduc… ☆133 Updated last year
- EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data" ☆344 Updated 9 months ago
- SummVis is an interactive visualization tool for text summarization. ☆253 Updated 3 years ago
- architectures and pre-trained models for long document classification. ☆155 Updated 4 years ago
- Authors' implementation of EMNLP-IJCNLP 2019 paper "Answering Complex Open-domain Questions Through Iterative Query Generation" ☆195 Updated 5 years ago
- Scripts and links to recreate the ELI5 dataset. ☆326 Updated 3 years ago
- QED: A Framework and Dataset for Explanations in Question Answering ☆117 Updated 4 years ago
- This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, an… ☆560 Updated 3 years ago
- Pre-Trained Models for ToD-BERT ☆294 Updated 2 years ago
- Fork of huggingface/pytorch-pretrained-BERT for BERT on STILTs ☆106 Updated 2 years ago
- Question Answering using Albert and Electra ☆208 Updated 2 years ago
- An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.) ☆445 Updated 2 months ago
- Document ranking via sentence modeling using BERT ☆145 Updated 2 years ago
- State of the art Semantic Sentence Embeddings ☆99 Updated 3 years ago