hetpandya / paraphrase-datasets-pretrained-models
A collection of preprocessed datasets and pretrained models for generating paraphrases.
☆31 Updated 4 years ago
Alternatives and similar repositories for paraphrase-datasets-pretrained-models
Users who are interested in paraphrase-datasets-pretrained-models are comparing it to the libraries listed below.
- Code and models used in "MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ☆99 Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 Updated last month
- This code provides a word-level language identification tool for identifying the language of individual words in code-mixed text, e.g. The tex… ☆55 Updated 5 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated 2 years ago
- Abstractive and Extractive Text summarization using Transformers. ☆86 Updated 2 years ago
- NTREX -- News Test References for MT Evaluation ☆85 Updated last year
- This tool helps automatically generate grammatically valid synthetic code-mixed data by utilizing linguistic theories such as Equivalenc… ☆56 Updated last year
- Text2Text Language Modeling Toolkit ☆302 Updated 9 months ago
- NewsQuizQA is a quiz-style question-answer dataset used for generating quiz questions about the news ☆35 Updated 4 years ago
- ☆104 Updated 4 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆98 Updated 2 years ago
- DialogSum: A Real-life Scenario Dialogue Summarization Dataset - Findings of ACL 2021 ☆182 Updated 10 months ago
- MILES is a multilingual text simplifier inspired by LSBert, a BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… ☆49 Updated 4 years ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 Updated 2 years ago
- MAFAND-MT ☆59 Updated last year
- Paraphrase any question with T5 (Text-To-Text Transfer Transformer) - Pretrained model and training script provided ☆185 Updated 2 years ago
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆161 Updated last year
- A web application that interfaces two GEC systems. [web instance is down] ☆32 Updated last year
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆56 Updated last year
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization ☆157 Updated 2 years ago
- A crowdsourced dataset of dialogues grounded in social contexts involving utilization of commonsense. ☆79 Updated 4 years ago
- MobileBERT and DistilBERT for extractive summarization ☆91 Updated 2 years ago
- This repository contains a dataset for hate speech detection on social media platforms. ☆74 Updated 2 years ago
- Quality Controlled Paraphrase Generation (ACL 2022) ☆71 Updated last month
- Zero-shot Transfer Learning from English to Arabic ☆30 Updated 3 years ago
- CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed dat… ☆36 Updated 4 years ago
- Reduce the size of pretrained Hugging Face models via vocabulary trimming. ☆47 Updated 2 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 Updated last year
- Resources for the "CTRLsum: Towards Generic Controllable Text Summarization" paper ☆147 Updated 5 months ago
- "Unsupervised Paraphrase Generation using Pre-trained Language Model." ☆22 Updated 5 years ago