explosion / spacy-pkuseg
The pkuseg toolkit for multi-domain Chinese word segmentation
☆58 Updated 7 months ago
Alternatives and similar repositories for spacy-pkuseg:
Users who are interested in spacy-pkuseg are comparing it to the libraries listed below:
- Performance benchmarks of the major Chinese word segmentation tools ☆157 Updated 6 years ago
- 渊 - A project for Classical Chinese ☆102 Updated 3 years ago
- A Chinese punctuation model that can add punctuation marks to text. ☆140 Updated 3 months ago
- A large, high-quality corpus of Chinese synonyms. ☆47 Updated 3 years ago
- A convenient Chinese word segmentation tool ☆46 Updated 3 months ago
- Chinese text error correction ☆92 Updated 3 years ago
- Grapheme-to-Phoneme lexicons for Chinese dialects ☆68 Updated 2 years ago
- ☆173 Updated 2 years ago
- Hong Kong Cantonese Corpus of transcribed speech (spontaneous speech, radio programmes and a monologue). ☆59 Updated last year
- Chinese text error correction based on BERT ☆234 Updated last year
- Corpus creator for Chinese Wikipedia ☆41 Updated 3 years ago
- PyTorch model for https://github.com/imcaspar/gpt2-ml ☆79 Updated 3 years ago
- Pinyin2Chinese: a simple demo implementation of pinyin segmentation and pinyin-to-Chinese conversion, based on a hidden Markov model (HMM) and a Trie. ☆86 Updated 6 years ago
- LERT: A Linguistically-motivated Pre-trained Language Model ☆206 Updated 2 years ago
- An MLM-based pre-trained BERT model for pinyin-to-Chinese-character conversion with error correction (pinyin corrector), implemented in PyTorch ☆45 Updated 4 years ago
- ChineseTextualInference: a Chinese textual inference project, including the translation and construction of an 880,000-entry Chinese textual entailment dataset and a deep-learning-based textual entailment classification model… ☆172 Updated 6 years ago
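- A tool for time expression extraction, parsing, and normalization ☆50 Updated 2 years ago
- Improves the accuracy of pypinyin using g2pW ☆87 Updated last year
- Chinese sentence segmentation and punctuation restoration, implemented with PyTorch 1.0. ☆58 Updated 5 years ago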
- ☆51 Updated 4 years ago
- rasa_chinese: a Rasa component extension package built specifically for the Chinese language, providing many Chinese-specific components ☆147 Updated last year
- ☆75 Updated 2 years ago
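- A parallel corpus for translation between Classical Chinese and modern Chinese ☆102 Updated 3 years ago
- 📦 Quickly converts between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more) ☆718 Updated 4 months ago
- A bot that converts text to vectors using sentence-transformers ☆45 Updated 2 years ago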
- PERT: Pre-training BERT with Permuted Language Model ☆360 Updated 2 years ago
- ☆28 Updated 5 months ago
- A pre-trained LSTM model that can help you segment unpunctuated historical Chinese texts. ☆26 Updated 3 years ago
- ☆34 Updated 3 years ago
- 🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition tool that supports models such as BertSoftmax and BertSpan and works out of the box. ☆113 Updated last year