quincyliang/nlp-public-dataset

quincyliang / nlp-public-dataset

Chinese and English NER datasets, English-Chinese machine translation datasets, and Chinese word segmentation datasets.

☆372

Alternatives and similar repositories for nlp-public-dataset

Users who are interested in nlp-public-dataset are comparing it to the repositories listed below.

benywon / en-ch-NMT
A neural machine translation system from English (Chinese) to Chinese (English), based on 30M parallel data.
☆69 · Mar 31, 2021 · Updated 5 years ago
CLUEbenchmark / CLUEDatasetSearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included.
☆4,459 · Nov 21, 2022 · Updated 3 years ago
yaleimeng / NER_corpus_chinese
Chinese NER (named entity recognition) corpora, available in one place.
☆130 · Sep 10, 2019 · Updated 6 years ago
VectorFist / RNN-NMT
An encoder-decoder neural machine translation model based on a bidirectional RNN and an attention mechanism.
☆62 · Jan 15, 2018 · Updated 8 years ago
LeeSureman / Flat-Lattice-Transformer
Code for ACL 2020 paper: FLAT: Chinese NER Using Flat-Lattice Transformer
☆1,003 · May 10, 2022 · Updated 4 years ago
InsaneLife / ChineseNLPCorpus
Chinese NLP datasets, material for everyday experiments. Additions and pull requests are welcome.
☆4,603 · Nov 21, 2023 · Updated 2 years ago
brightmart / nlp_chinese_corpus
Large-scale Chinese corpus for NLP.
☆9,904 · Feb 6, 2026 · Updated 5 months ago
kunde122 / PaddleNLP
☆11 · Dec 9, 2019 · Updated 6 years ago
Embedding / Chinese-Word-Vectors
100+ pre-trained Chinese word vectors.
☆12,229 · Oct 30, 2023 · Updated 2 years ago
ymcui / Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (the Chinese BERT-wwm model series).
☆10,223 · Apr 19, 2026 · Updated 3 months ago
loujie0822 / DeepIE
DeepIE: Deep Learning for Information Extraction
☆1,937 · Dec 9, 2022 · Updated 3 years ago
lonePatient / BERT-NER-Pytorch
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
☆2,239 · Mar 11, 2023 · Updated 3 years ago
OYE93 / Chinese-NLP-Corpus
A collection of Chinese NLP corpora.
☆921 · Dec 28, 2020 · Updated 5 years ago
gzhcv / AIChallenger2018_English_Chinese_Machine_Translation
The 10th-place solution on the TestB leaderboard, BLEU 32.1.
☆63 · Nov 28, 2019 · Updated 6 years ago
jiesutd / LatticeLSTM
Chinese NER using Lattice LSTM. Code for ACL 2018 paper.
☆1,836 · Apr 25, 2019 · Updated 7 years ago
v-mipeng / LexiconAugmentedNER
Reject complicated operations for incorporating lexicon for Chinese NER.
☆437 · Jan 22, 2022 · Updated 4 years ago
FudanNLP / TENER
Code for "TENER: Adapting Transformer Encoder for Named Entity Recognition"
☆376 · Jul 6, 2020 · Updated 6 years ago
foamliu / Machine-Translation
Chinese-English machine text translation.
☆169 · Jul 2, 2019 · Updated 7 years ago
luopeixiang / named_entity_recognition
Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, and BiLSTM+CRF).
☆2,287 · Jun 21, 2022 · Updated 4 years ago
panchunguang / ccks_baidu_entity_link
CCKS Baidu entity linking, first-place solution.
☆841 · Dec 19, 2023 · Updated 2 years ago
ymcui / Chinese-ELECTRA
Pre-trained Chinese ELECTRA models.
☆1,433 · Apr 19, 2026 · Updated 3 months ago
CLUEbenchmark / CLUE
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
☆4,271 · Feb 6, 2026 · Updated 5 months ago
425776024 / nlpcda
A one-step Chinese data augmentation package: NLP data augmentation, BERT data augmentation, EDA. Install with: pip install nlpcda
☆1,880 · Mar 18, 2025 · Updated last year
foamliu / Machine-Translation-v2
English-Chinese machine text translation.
☆63 · Jan 2, 2019 · Updated 7 years ago
bojone / chinese-gen
Chinese generative pre-trained models.
☆99 · Aug 28, 2020 · Updated 5 years ago
crownpku / Awesome-Chinese-NLP
A curated list of resources for Chinese NLP.
☆7,929 · Jul 27, 2023 · Updated 2 years ago
SophonPlus / ChineseNlpCorpus
Collects, organizes, and publishes Chinese NLP corpora and datasets, working with like-minded people to advance Chinese natural language processing.
☆6,588 · Jan 29, 2019 · Updated 7 years ago
jiachenwestlake / Cross-Domain_NER
Cross-domain NER using cross-domain language modeling. Code for ACL 2019 paper.
☆88 · Jun 3, 2020 · Updated 6 years ago
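z814081807 / DeepNER
Champion solution for the Tianchi Traditional Chinese Medicine instruction-leaflet entity recognition challenge; Chinese named entity recognition; NER; BERT-CRF & BERT-SPAN & BERT-MRC; PyTorch
☆967 · Dec 23, 2020 · Updated 5 years ago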
mmichazzj / Semantic-Role-Labeling
End-to-end semantic role labeling with LSTM (Theano).
☆55 · Dec 9, 2019 · Updated 6 years ago
yhcc / OntoNotes-5.0-NER
This repo can be used to convert OntoNotes-5.0 to CoNLL format.
☆132 · Nov 3, 2022 · Updated 3 years ago
kaiyinzhou / BERT-NER
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
☆1,279 · May 19, 2022 · Updated 4 years ago
buppt / ChineseNER
Chinese named entity recognition, entity extraction, TensorFlow, PyTorch, BiLSTM+CRF.
☆1,464 · Mar 15, 2020 · Updated 6 years ago
bqw18744018044 / BertForNER
An example of a custom named entity recognition model based on transformers.
☆17 · Jul 4, 2021 · Updated 5 years ago
liu-nlper / SLTK
A sequence labeling toolkit implementing the BLSTM-CNN-CRF model in PyTorch; F1 on the CoNLL 2003 English NER test set is 91.10% (word and char features).
☆365 · Jul 24, 2018 · Updated 7 years ago
thinkwee / UniKeyphrase
[ACL2021] A Unified Extraction and Generation Framework for Keyphrase Prediction
☆25 · Oct 3, 2024 · Updated last year
renjunxiang / ccks2019_el
CCKS 2019 Task 2: Entity Recognition and Linking
☆94 · Aug 6, 2019 · Updated 6 years ago
Mleader2 / text_scalpel
Modifies Chinese text; adapted from the LaserTagger model. I call it "文本手术刀" ("text scalpel"). Currently, the project implements a text paraphrasing task used for data augmentation of NLP corpora.
☆215 · Mar 24, 2023 · Updated 3 years ago
BrikerMan / Kashgari-doc-zh
Chinese documentation for the Kashgari framework.
☆22 · Sep 11, 2020 · Updated 5 years ago