hetpandya / textgenie
A python package to augment text data using NLP.
☆40Updated 2 months ago
Alternatives and similar repositories for textgenie:
Users that are interested in textgenie are comparing it to the libraries listed below
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.☆102Updated 2 years ago
- The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)☆52Updated 2 years ago
- This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild" accepted at EMNLP-2020…☆33Updated 4 years ago
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020☆62Updated 11 months ago
- Code for the paper: Saying No is An Art: Contextualized Fallback Responses for Unanswerable Dialogue Queries☆19Updated 3 years ago
- NewsQuizQA is a quiz-style question-answer dataset used for generating quiz questions about the news☆34Updated 4 years ago
- SUPERT: Unsupervised multi-document summarization evaluation & generation☆94Updated 2 years ago
- LAReQA is a challenging benchmark for evaluating language agnostic answer retrieval from a multilingual candidate pool. This repository c…☆14Updated 4 years ago
- Use BERT to Fill in the Blanks☆82Updated 3 years ago
- FactSumm: Factual Consistency Scorer for Abstractive Summarization☆110Updated last year
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset"☆54Updated 3 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".☆99Updated 2 years ago
- LongSumm - Scientific Document Summarization Task☆74Updated 2 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish.☆59Updated 2 years ago
- ☆22Updated 3 years ago
- On Generating Extended Summaries of Long Documents☆78Updated 4 years ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization.☆34Updated 2 years ago
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.☆39Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do…☆80Updated 9 months ago
- Zero-shot Transfer Learning from English to Arabic☆29Updated 2 years ago
- Efficient-Sentence-Embedding-using-Discrete-Cosine-Transform☆17Updated 4 years ago
- Dataset, models, and code for paper "CiteSum: Citation Text-guided Scientific Extreme Summarization and Low-resource Domain Adaptation", …☆33Updated 2 years ago
- Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles☆43Updated 10 months ago
- Benchmarking various Deep Learning models such as BERT, ALBERT, BiLSTMs on the task of sentence entailment using two datasets - MultiNLI …☆28Updated 4 years ago
- A repository for our AAAI-2020 Cross-lingual-NER paper. Code will be updated shortly.☆47Updated 2 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models☆65Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆93Updated 2 years ago
- This repository contains the PyTorch implementation of the paper STaCK: Sentence Ordering with Temporal Commonsense Knowledge appearing a…☆28Updated 2 years ago
- Reimplementation of a BERT based model (Shi et al, 2019), currently the state-of-the-art for English SRL. This model implements also pred…☆70Updated 3 years ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model.☆37Updated 3 years ago