yuikns / icwb2-data
This directory contains the training, test, and gold-standard data used in the 2nd International Chinese Word Segmentation Bakeoff. Also included is the script used to score the results submitted by the bakeoff participants and the simple segmenter used to generate the baseline and topline data.
☆67 Updated 7 years ago
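The description above mentions a scoring script for segmentation output. As a rough illustration of what scoring a segmenter involves, the sketch below computes word-level precision, recall, and F1 by comparing character spans of predicted words against gold-standard words. This is not the bakeoff's own script; the function names `word_spans` and `score_segmentation` are hypothetical, and the input is assumed to be space-delimited segmented text.

```python
def word_spans(words):
    """Map a list of words to a set of (start, end) character offsets."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score_segmentation(gold_line, pred_line):
    """Return (precision, recall, f1) for one segmented line.

    Both arguments are strings whose words are separated by whitespace.
    A predicted word counts as correct only if both of its boundaries
    match a gold word exactly.
    """
    gold_words = gold_line.split()
    pred_words = pred_line.split()
    if "".join(gold_words) != "".join(pred_words):
        raise ValueError("gold and prediction must cover the same characters")

    gold = word_spans(gold_words)
    pred = word_spans(pred_words)
    correct = len(gold & pred)

    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative sentence only, not taken from the bakeoff data.
    p, r, f = score_segmentation("我 喜欢 北京 天安门", "我 喜 欢 北京 天安门")
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Comparing spans rather than word strings means a repeated word is only credited where its position also matches, which is the usual convention for segmentation scoring.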
Alternatives and similar repositories for icwb2-data
Users who are interested in icwb2-data are comparing it to the repositories listed below
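- Chinese corpora for NER (named entity recognition), available in one place ☆129 Updated 5 years ago
- Dependency parsing, NLP, natural language processing ☆85 Updated 3 years ago
- Text classification, similar-sentence detection, and part-of-speech tagging with BERT ☆89 Updated 6 years ago
- SMP2017 Chinese human-computer dialogue evaluation data ☆107 Updated 7 years ago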
- Word similarity computation based on Tongyici Cilin ☆120 Updated 8 years ago
- A Chinese word segmentation model based on BERT, F1 score 97% ☆93 Updated 6 years ago
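- Word segmentation implemented with CRF++ in Python ☆37 Updated 7 years ago
- New word discovery based on word frequency, cohesion coefficient, and left/right adjacency entropy ☆122 Updated 5 years ago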
- Dataset from 'Character-based BiLSTM-CRF Incorporating POS and Dictionaries for Chinese Opinion Target Extraction' ☆44 Updated 6 years ago
- BERT fine-tuning for CMRC2018, CJRC, DRCD, CHID, C3 ☆183 Updated 5 years ago
- A curated list of Chinese corpus resources for NLP (Natural Language Processing) ☆75 Updated 5 years ago
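- Unsupervised word segmentation and syntactic parsing based on BERT ☆110 Updated 5 years ago
- Chinese pre-trained unilm model ☆83 Updated 4 years ago
- albert+BiLstm+CRF implemented on top of the lightweight albert ☆89 Updated 2 years ago
- Chinese sequence labeling based on BERT ☆141 Updated 6 years ago
- Converts Baidu ernie's paddlepaddle model to a tensorflow model ☆177 Updated 5 years ago
- A general-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF, and IDCNN+CRF, with more algorithms being added) for Chinese word segmentation (Tokenizer / segmentation), part-of-speech tagging… ☆84 Updated 2 years ago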
- Transformers implementation (architecture, task examples, serving, and more) ☆95 Updated 3 years ago
- New word discovery algorithm (NewWordDetection) ☆62 Updated 7 years ago
- WordMultiSenseDisambiguation, Chinese multi-wordsense disambiguation based on online bake knowledge base and semantic embedding similarit… ☆128 Updated 6 years ago
- Self-implemented SpellCorrection: query correction based on pinyin similarity and edit distance ☆83 Updated 3 years ago
- NLP NER datasets video/music/book bio ☆90 Updated 4 years ago
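- A bert-based semantic matching model built on a pre-trained Chinese model; the dataset is the official LCQMC data ☆198 Updated 5 years ago
- Chinese semantic role labeling based on Bi-LSTM and CRF ☆87 Updated 6 years ago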
- Subword Encoding in Lattice LSTM for Chinese Word Segmentation ☆53 Updated 6 years ago
- Code for a Chinese error detection module using n-gram and bi-lstm ☆135 Updated 6 years ago
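- A bert-based Chinese natural language processing toolkit with sentiment analysis, Chinese word segmentation, part-of-speech tagging, and named entity recognition; it also provides training and prediction interfaces for text classification, sequence labeling, and sentence-pair relation tasks ☆132 Updated 6 years ago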
- ☆74 Updated 2 years ago
- A simple review opinion extraction module based on ltp ☆116 Updated 6 years ago
- Performance evaluation of major Chinese word segmenters ☆157 Updated 6 years ago