facebookresearch / SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in combination with self-training and knowledge-distillation, or for retrieving paraphrases.
☆361 Updated 2 years ago
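The retrieval idea in the description above can be illustrated with a minimal sketch. This is not SentAugment's code: it assumes a toy bag-of-words embedding in place of a learned sentence encoder and a plain linear scan in place of a large-scale index. The names `embed`, `cosine`, and `retrieve` are hypothetical and exist only for this illustration.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy embedding: lowercase bag-of-words counts.

    Assumption: stands in for a learned sentence encoder.
    """
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, bank, k=2):
    """Return the k (score, sentence) pairs from the bank most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), s) for s in bank]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# A tiny illustrative sentence bank (made-up sentences).
bank = [
    "The movie was great and the acting was superb.",
    "Stock prices fell sharply on Monday.",
    "I thought the film was great.",
    "The weather is cold today.",
]
neighbors = retrieve("The film was really great.", bank, k=2)
for score, sentence in neighbors:
    print(f"{score:.3f}  {sentence}")
```

In a self-training setup, the retrieved neighbors would typically be labeled by a teacher model and added to the training set; that step is omitted here.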
Alternatives and similar repositories for SentAugment:
Users who are interested in SentAugment are comparing it to the libraries listed below.
- ☆344 Updated 3 years ago
- Interpretable Evaluation for (Almost) All NLP Tasks ☆195 Updated 2 years ago
- Awesome Neural Adaptation in Natural Language Processing. A curated list. https://arxiv.org/abs/2006.00632 ☆265 Updated 3 years ago
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!) ☆325 Updated last year
- [NAACL 2021] This is the code for our paper `Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self… ☆201 Updated 2 years ago
- An elaborate and exhaustive paper list for Named Entity Recognition (NER) ☆394 Updated 2 years ago
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines ☆133 Updated last year
- Neural Text Generation with Unlikelihood Training ☆309 Updated 3 years ago
- The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to o… ☆380 Updated last year
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆429 Updated 2 years ago
- Interpretable Evaluation for AI Systems ☆361 Updated last year
- [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o… ☆604 Updated 2 years ago
- With the aim of building next generation virtual assistants that can handle multimodal inputs and perform multimodal actions, we introduc… ☆131 Updated last year
- A Corpus for Multilingual Document Classification in Eight Languages. ☆151 Updated 2 years ago
- Summarization Task using Bart and T5 models. ☆169 Updated 4 years ago
- SummVis is an interactive visualization tool for text summarization. ☆251 Updated 2 years ago
- Pre-Trained Models for ToD-BERT ☆291 Updated last year
- [EMNLP 2021] Improving and Simplifying Pattern Exploiting Training ☆154 Updated 2 years ago
- New dataset ☆300 Updated 3 years ago
- Officially supported AllenNLP models ☆534 Updated 2 years ago
- Architectures and pre-trained models for long document classification. ☆154 Updated 4 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models".