quincyliang/nlp-data-augmentation

quincyliang / nlp-data-augmentation

Data Augmentation for NLP.

☆294

Alternatives and similar repositories for nlp-data-augmentation

Users who are interested in nlp-data-augmentation are comparing it to the libraries listed below.

noisemix / noisemix
NoiseMix - data generation for natural language
☆40 · May 26, 2018 · Updated 8 years ago
zhanlaoban / EDA_NLP_for_Chinese
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
☆1,383 · May 31, 2022 · Updated 4 years ago
yongzhuo / nlp_xiaojiang
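Natural language processing (NLP), the Xiaojiang bot (a retrieval-based chitchat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification, entity extraction (ner, b…
☆1,535 · Sep 23, 2021 · Updated 4 years ago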
zhpmatrix / nlp-competitions-list-review
A review of the top solutions from NLP competitions, focused solely on NLP competitions and continuously updated!
☆2,804 · Apr 4, 2026 · Updated 3 months ago
jasonwei20 / eda_nlp
Data augmentation for NLP, presented at EMNLP 2019
☆1,651 · Mar 19, 2023 · Updated 3 years ago
425776024 / nlpcda
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
☆1,879 · Mar 18, 2025 · Updated last year
pfnet-research / contextual_augmentation
Contextual augmentation, a text data augmentation using a bidirectional language model.
☆191 · Jan 3, 2020 · Updated 6 years ago
brightmart / albert_zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pre-trained ALBERT models
☆3,980 · Nov 21, 2022 · Updated 3 years ago
didi / ChineseNLP
Datasets and SOTA results for every field of Chinese NLP
☆1,806 · Apr 7, 2022 · Updated 4 years ago
sinovation / ZEN
A BERT-based Chinese Text Encoder Enhanced by N-gram Representations
☆643 · Jul 24, 2022 · Updated 4 years ago
pengshuang / Text-Similarity
Text-Similarity Method in PyTorch
☆468 · Dec 9, 2018 · Updated 7 years ago
CLUEbenchmark / CLUE
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
☆4,273 · Feb 6, 2026 · Updated 5 months ago
OYE93 / Chinese-NLP-Corpus
Collection of Chinese NLP corpora
☆922 · Dec 28, 2020 · Updated 5 years ago
huawei-noah / Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
☆3,162 · Jan 22, 2024 · Updated 2 years ago
ewrfcas / bert_cn_finetune
BERT fine-tuning for CMRC2018, CJRC, DRCD, CHID, C3
☆185 · Jun 4, 2020 · Updated 6 years ago
dbiir / UER-py
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
☆3,110 · May 9, 2024 · Updated 2 years ago
namisan / mt-dnn
Multi-Task Deep Neural Networks for Natural Language Understanding
☆2,259 · Mar 7, 2024 · Updated 2 years ago
ChineseGLUE / ChineseGLUE
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus and leaderboard
☆1,783 · Feb 18, 2023 · Updated 3 years ago
ymcui / Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
☆10,224 · Apr 19, 2026 · Updated 3 months ago
bojone / ee-2019-baseline
Event subject extraction for the financial domain (ccks2019), a baseline
☆118 · May 13, 2019 · Updated 7 years ago
ZhuiyiTechnology / pretrained-models
Open Language Pre-trained Model Zoo
☆1,003 · Nov 18, 2021 · Updated 4 years ago
pengming617 / text_matching
Text matching models including DSSM, ESIM, ABCNN, and BIMPM; the dataset is the official LCQMC data
☆470 · May 8, 2022 · Updated 4 years ago
Jiakui / awesome-bert
BERT NLP papers, applications and GitHub resources, including the newest XLNet; papers and GitHub projects related to BERT and XLNet
☆1,840 · Mar 21, 2021 · Updated 5 years ago
brightmart / xlnet_zh
Pre-trained Chinese XLNet model: Pre-Trained Chinese XLNet_Large
☆228 · Sep 13, 2019 · Updated 6 years ago
tongchangD / text_data_enhancement_with_LaserTagger
Modify Chinese text, based on the LaserTagger model. Text paraphrasing; Chinese text data augmentation based on LaserTagger.
☆320 · Jan 3, 2024 · Updated 2 years ago
loujie0822 / DeepIE
DeepIE: Deep Learning for Information Extraction
☆1,937 · Dec 9, 2022 · Updated 3 years ago
brightmart / roberta_zh
Chinese pre-trained RoBERTa models: RoBERTa for Chinese
☆2,793 · Jul 22, 2024 · Updated 2 years ago
brightmart / nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
☆9,906 · Feb 6, 2026 · Updated 5 months ago
qiangsiwei / bert_distill
BERT distillation (distillation experiments based on BERT)
☆316 · Jul 30, 2020 · Updated 5 years ago
thunlp / OpenCLaP
Open Chinese Language Pre-trained Model Zoo
☆983 · Mar 18, 2020 · Updated 6 years ago
ymcui / Chinese-XLNet
Pre-Trained Chinese XLNet (Chinese XLNet pre-trained models)
☆1,647 · Apr 19, 2026 · Updated 3 months ago
makcedward / nlpaug
Data augmentation for NLP
☆4,663 · Updated this week
NTMC-Community / MatchZoo
Facilitating the design, comparison and sharing of deep text matching models.
☆3,847 · Aug 2, 2024 · Updated last year
ymcui / Chinese-ELECTRA
Pre-trained Chinese ELECTRA (Chinese ELECTRA pre-trained models)
☆1,433 · Apr 19, 2026 · Updated 3 months ago
caishiqing / joint-mrc
Joint learning of machine retrieval and reading; rank 6 solution for the LES Cup: the 2nd National "Military Intelligence Machine Reading" Challenge
☆128 · Oct 20, 2020 · Updated 5 years ago
DataTerminatorX / Keyword-BERT
☆277 · Dec 8, 2020 · Updated 5 years ago
padeoe / cail2019
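Second-place solution for the CAIL 2019 similar case matching track (with dataset and documentation); champion team of the CAIL2020/2021 judicial examination track
☆251 · Jun 4, 2021 · Updated 5 years ago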
charlesXu86 / Chatbot_CN
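A chatbot for the finance and legal domains (with some chitchat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph; the front end is integrated with Django, and RESTful interfaces for NLP and KG are already wrapped.
☆1,291 · Jun 13, 2021 · Updated 5 years ago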
yym6472 / ConSERT
Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
☆542 · Dec 10, 2021 · Updated 4 years ago