Data augmentation for NLP
★ 4,645 | Jun 24, 2024 | Updated last year
Alternatives and similar repositories for nlpaug
Users who are interested in nlpaug are comparing it to the libraries listed below.
- TextAttack is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs… | ★ 3,369 | Jul 10, 2025 | Updated 7 months ago
- Data augmentation for NLP, presented at EMNLP 2019 | ★ 1,650 | Mar 19, 2023 | Updated 2 years ago
- State-of-the-Art Text Embeddings | ★ 18,323 | Feb 27, 2026 | Updated last week
- A very simple framework for state-of-the-art Natural Language Processing (NLP) | ★ 14,354 | Oct 27, 2025 | Updated 4 months ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList | ★ 2,050 | Jan 9, 2024 | Updated 2 years ago
- Collection of papers and resources for data augmentation for NLP. | ★ 831 | Aug 12, 2022 | Updated 3 years ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. | ★ 32,170 | Sep 30, 2025 | Updated 5 months ago
- Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo… | ★ 22,981 | Jul 28, 2024 | Updated last year
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… | ★ 4,231 | Aug 25, 2025 | Updated 6 months ago
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821 | ★ 3,644 | Oct 16, 2024 | Updated last year
- TextAugment: Text Augmentation Library | ★ 432 | Dec 10, 2025 | Updated 2 months ago
- NL-Augmenter: A Collaborative Repository of Natural Language Transformations | ★ 786 | May 19, 2024 | Updated last year
- An open-source NLP research library, built on PyTorch. | ★ 11,889 | Nov 22, 2022 | Updated 3 years ago
- BertViz: Visualize Attention in Transformer Models | ★ 7,932 | Jan 8, 2026 | Updated last month
- Unsupervised Data Augmentation (UDA) | ★ 2,204 | Aug 28, 2021 | Updated 4 years ago
- The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic … | ★ 3,634 | Feb 20, 2026 | Updated 2 weeks ago
- skweak: A software toolkit for weak supervision applied to NLP tasks | ★ 926 | Sep 2, 2024 | Updated last year
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. | ★ 7,426 | Feb 20, 2026 | Updated 2 weeks ago
- Scalable embedding, reasoning, ranking for images and sentences with CLIP | ★ 12,820 | Jan 23, 2024 | Updated 2 years ago
- This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference" | ★ 1,628 | Jun 12, 2023 | Updated 2 years ago
- Open source annotation tool for machine learning practitioners. | ★ 10,555 | Feb 17, 2026 | Updated 2 weeks ago
- Longformer: The Long-Document Transformer | ★ 2,188 | Feb 8, 2023 | Updated 3 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | ★ 2,371 | Mar 23, 2024 | Updated last year
- Multi-Task Deep Neural Networks for Natural Language Understanding | ★ 2,258 | Mar 7, 2024 | Updated last year
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… | ★ 2,083 | Aug 15, 2024 | Updated last year
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" | ★ 6,489 | Jan 14, 2026 | Updated last month
- Super easy library for BERT-based NLP models | ★ 1,919 | Aug 19, 2024 | Updated last year
- A data augmentations library for audio, image, text, and video. | ★ 5,071 | Feb 13, 2026 | Updated 3 weeks ago
- A system for quickly generating training data with weak supervision | ★ 5,939 | May 2, 2024 | Updated last year
- Minimal keyword extraction with BERT | ★ 4,116 | Feb 3, 2026 | Updated last month
- Unsupervised text tokenizer for Neural Network-based text generation. | ★ 11,668 | Feb 22, 2026 | Updated last week
- XLNet: Generalized Autoregressive Pretraining for Language Understanding | ★ 6,176 | May 28, 2023 | Updated 2 years ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages | ★ 7,733 | Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities | ★ 22,030 | Jan 23, 2026 | Updated last month
- PyTorch original implementation of Cross-lingual Language Model Pretraining. | ★ 2,926 | Feb 14, 2023 | Updated 3 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… | ★ 359 | Feb 22, 2022 | Updated 4 years ago
- A Unified Library for Parameter-Efficient and Modular Transfer Learning | ★ 2,801 | Updated this week
- Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering. | ★ 1,752 | Dec 20, 2023 | Updated 2 years ago
- A Python tool for evaluating the quality of sentence embeddings. | ★ 2,106 | Mar 19, 2024 | Updated last year