Data augmentation for NLP
★4,656 · Jun 24, 2024 · Updated last year
Alternatives and similar repositories for nlpaug
Users that are interested in nlpaug are comparing it to the libraries listed below.
- TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs… ★3,409 · Apr 17, 2026 · Updated 2 weeks ago
- Data augmentation for NLP, presented at EMNLP 2019 ★1,652 · Mar 19, 2023 · Updated 3 years ago
- State-of-the-Art Text Embeddings ★18,615 · Updated this week
- Collection of papers and resources for data augmentation for NLP. ★834 · Aug 12, 2022 · Updated 3 years ago
- TextAugment: Text Augmentation Library ★435 · Mar 4, 2026 · Updated 2 months ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList ★2,049 · Jan 9, 2024 · Updated 2 years ago
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations ★787 · May 19, 2024 · Updated last year
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★14,370 · Oct 27, 2025 · Updated 6 months ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,212 · Sep 30, 2025 · Updated 7 months ago
- Unsupervised Data Augmentation (UDA) ★2,206 · Aug 28, 2021 · Updated 4 years ago
- Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo… ★22,972 · Jul 28, 2024 · Updated last year
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821 ★3,651 · Oct 16, 2024 · Updated last year
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… ★4,239 · Aug 25, 2025 · Updated 8 months ago
- An open-source NLP research library, built on PyTorch. ★11,891 · Nov 22, 2022 · Updated 3 years ago
- BertViz: Visualize Attention in Transformer Models ★8,041 · Jan 8, 2026 · Updated 3 months ago
- The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic … ★3,653 · Apr 15, 2026 · Updated 3 weeks ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ★926 · Sep 2, 2024 · Updated last year
- 🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP ★12,836 · Jan 23, 2024 · Updated 2 years ago
- This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference" ★1,627 · Jun 12, 2023 · Updated 2 years ago
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. ★7,578 · Feb 20, 2026 · Updated 2 months ago
- Open source annotation tool for machine learning practitioners. ★10,639 · Apr 14, 2026 · Updated 3 weeks ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★160,073 · Apr 29, 2026 · Updated last week
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ★2,372 · Mar 23, 2024 · Updated 2 years ago
- Longformer: The Long-Document Transformer ★2,194 · Feb 8, 2023 · Updated 3 years ago
- A data augmentations library for audio, image, text, and video. ★5,082 · Updated this week
- Unsupervised text tokenizer for Neural Network-based text generation. ★11,792 · Apr 26, 2026 · Updated last week
- Multi-Task Deep Neural Networks for Natural Language Understanding ★2,257 · Mar 7, 2024 · Updated 2 years ago
- A system for quickly generating training data with weak supervision ★5,957 · Apr 10, 2026 · Updated 3 weeks ago
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★6,513 · Jan 14, 2026 · Updated 3 months ago
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… ★2,096 · Aug 15, 2024 · Updated last year
- Minimal keyword extraction with BERT ★4,163 · Feb 3, 2026 · Updated 3 months ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ★7,783 · Updated this week
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ★2,932 · Feb 14, 2023 · Updated 3 years ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,114 · Jan 23, 2026 · Updated 3 months ago
- Super easy library for BERT based NLP models ★1,921 · Aug 19, 2024 · Updated last year
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ★359 · Feb 22, 2022 · Updated 4 years ago
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ★2,812 · Apr 26, 2026 · Updated last week
- Must-read Papers on pre-trained language models. ★3,362 · Nov 6, 2022 · Updated 3 years ago
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ★6,177 · May 28, 2023 · Updated 2 years ago