hetpandya / textgenie
A Python package to augment text data using NLP.
☆40 Updated last month
Alternatives and similar repositories for textgenie:
Users who are interested in textgenie are comparing it to the libraries listed below.
- Load What You Need: Smaller Multilingual Transformers for PyTorch and TensorFlow 2.0. ☆102 Updated 2 years ago
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. ☆39 Updated last year
- Zero-shot Transfer Learning from English to Arabic ☆29 Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 Updated 2 years ago
- NewsQuizQA is a quiz-style question-answer dataset used for generating quiz questions about the news ☆34 Updated 4 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- A repository for our AAAI-2020 Cross-lingual-NER paper. Code will be updated shortly. ☆47 Updated 2 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆119 Updated 3 years ago
- Resources for the "CTRLsum: Towards Generic Controllable Text Summarization" paper ☆146 Updated last year
- Multilingual abstractive summarization dataset extracted from WikiHow. ☆88 Updated 2 weeks ago
- A collection of preprocessed datasets and pretrained models for generating paraphrases. ☆29 Updated 3 years ago
- A package for fine-tuning Transformers with TPUs, written in TensorFlow 2.0+ ☆37 Updated 4 years ago
- This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild" accepted at EMNLP-2020… ☆33 Updated 4 years ago
- Benchmarking various Deep Learning models such as BERT, ALBERT, BiLSTMs on the task of sentence entailment using two datasets - MultiNLI … ☆28 Updated 4 years ago
- Build a dialog dataset from online books in many languages ☆72 Updated 2 years ago
- Consists of the largest (10K) human annotated code-switched semantic parsing dataset & 170K generated utterance using the CST5 augmentati… ☆37 Updated 2 years ago
- A web application that interfaces two GEC systems. [web instance is down] ☆31 Updated 8 months ago
- Reimplementation of a BERT-based model (Shi et al., 2019), currently the state-of-the-art for English SRL. This model implements also pred… ☆70 Updated 3 years ago
- Coreference Resolution ☆75 Updated 4 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- Lexical Simplification with Pretrained Encoders ☆70 Updated 4 years ago
- The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021) ☆52 Updated 2 years ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 Updated 2 years ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 Updated 11 months ago