ares5221 / Common-NLP-Datasets

☆17

Alternatives and similar repositories for Common-NLP-Datasets

Users who are interested in Common-NLP-Datasets are comparing it to the repositories listed below

wptoux / albert-chinese-large-webqa
ALBERT Large QA model trained on the Baidu WebQA and DuReader datasets
☆77 Updated 5 years ago
zhongerqiandan / pretrained-unilm-Chinese
Chinese pretrained UniLM model
☆83 Updated 4 years ago
bojone / nezha_gpt_dialog
☆102 Updated 4 years ago
bojone / chinese-gen
Chinese generative pretrained models
☆99 Updated 5 years ago
CLUEbenchmark / DistilBert
DistilBERT for Chinese: a distilled BERT model pretrained on a large-scale Chinese corpus
☆94 Updated 5 years ago
CLUEbenchmark / PyCLUE
Python toolkit for the Chinese Language Understanding Evaluation (CLUE) benchmark
☆132 Updated 2 years ago
Hongyu-Li / RapGenerator_GPT2
🎵Using GPT2-Chinese to generate rap lyrics🎵
☆29 Updated 2 years ago
flowers2023 / lm-ken
KenLM language model, with a Python REST service
☆30 Updated 7 years ago
CLUEbenchmark / LightLM
High-performance small model evaluation. Shared Tasks in NLPCC 2020, Task 1: Light Pre-Training Chinese Language Model for NLP Task
☆60 Updated 5 years ago
yujunhuics / timeparser
Tool for time expression extraction, parsing, and normalization
☆55 Updated 2 years ago
CLUEbenchmark / MobileQA
Offline on-device reading comprehension app: QA for mobile, Android & iPhone
☆60 Updated 2 years ago
howl-anderson / seq2annotation
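General-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF, and IDCNN+CRF, with more algorithms being added) for Chinese word segmentation (tokenizer / segmentation), part-of-speech tagging…
☆86 Updated 2 years ago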
XierHacker / ChineseWordSegment
TensorFlow implementation of Chinese word segmentation using LSTM+CRF and Dilated CNN+CRF
☆15 Updated 7 years ago
Hanlard / Electra_CRF_NER
We start a company-name recognition task with small-scale, low-quality training data, then use techniques to enhance model training s…
☆81 Updated 5 years ago
bojone / unsupervised-text-generation
Some methods for unsupervised text generation
☆49 Updated 4 years ago
GlassyWing / transformer-word-segmenter
Sequence labeling based on Universal Transformer (Transformer encoder) and CRF; Chinese word segmentation and part-of-speech tagging based on Universal Transformer + CRF
☆161 Updated 6 years ago
bojone / pytorch_bert_to_tf
Convert PyTorch BERT weights to TensorFlow
☆22 Updated 5 years ago
425776024 / lasertagger-chinese
lasertagger-chinese: a Chinese LaserTagger learning example, with sample data, comments, and shell run scripts
☆76 Updated 2 years ago
CLUEbenchmark / CLUEWSC2020
CLUEWSC2020: Chinese version of the Winograd Schema Challenge (WSC), a Chinese coreference resolution task
☆78 Updated 5 years ago
gzhcv / AIChallenger2018_English_Chinese_Machine_Translation
10th-place solution on the Test B leaderboard, BLEU 32.1
☆64 Updated 5 years ago
bojone / shuffle
Shuffle files of several hundred GB in Python
☆33 Updated 4 years ago
ewrfcas / bert_cn_finetune
BERT fine-tuning for CMRC2018, CJRC, DRCD, CHID, C3
☆184 Updated 5 years ago
HillZhang1999 / CTC-Report
CTC2021: SOTA solution and online demo for the Chinese Text Correction competition
☆73 Updated 2 years ago
liucongg / UnilmChatchitRobot
UniLM for Chinese Chitchat Robot. A compliment-style (kuakua) chitchat bot project based on the UniLM model.
☆158 Updated 4 years ago
pluto-junzeng / C4-zh
Large-scale Chinese corpus
☆44 Updated 5 years ago
dalinvip / pytorch_Joint-Word-Segmentation-and-POS-Tagging
Paper: A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging
☆35 Updated 6 years ago
zejunwang1 / bert4vec
A sentence embedding generation tool based on pretrained models
☆138 Updated 2 years ago
Hanlard / Bert-for-WebQA
Reading-comprehension QA with BERT on the Baidu WebQA Chinese question-answering dataset
☆65 Updated 5 years ago
charlesXu86 / char_featurizer
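Chinese character feature extraction tool that extracts pronunciation (initial, final, tone), glyph (components, radicals), four-corner code, and other features, which can also be fed into a model as tensors
☆137 Updated 5 years ago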
nghuyong / bert-classification-tf-serving
Use BERT to train a classification model and deploy the model by tensorflow serving
☆50 Updated 4 years ago