explosion / spacy-pkuseg
The pkuseg toolkit for multi-domain Chinese word segmentation
☆68 Updated 5 months ago
Alternatives and similar repositories for spacy-pkuseg
Users who are interested in spacy-pkuseg are comparing it to the libraries listed below.
- 📦 Fast conversion between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more) ☆752 Updated last year
- A large, high-quality corpus of Chinese synonyms. ☆69 Updated 4 years ago
- 渊 - A project for Classical Chinese ☆110 Updated 3 years ago
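- A convenient Chinese word segmentation tool ☆51 Updated 7 months ago
- The best tool for converting Chinese numerals to Arabic digits. Handles many Chinese expressions such as "点二八" (point two eight) and "负百分之四十" (negative forty percent). A must-have for NLP and robot engineering! ☆371 Updated 2 years ago
- A Chinese punctuation model that adds punctuation to text. ☆147 Updated last year
- MiniRBT (a series of small Chinese pre-trained models) ☆298 Updated 5 months ago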
- ✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux ☆67 Updated 3 months ago
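- Improves the accuracy of pypinyin using g2pW ☆102 Updated 2 years ago
- Chinese, word segmentation, word lists, core dictionary, event word lists, stop words, sensitive words, Q&A, Q&A data, knowledge graphs, text corpora. ☆171 Updated 4 years ago
- CINO: Pre-trained Language Models for Chinese Minority Languages ☆258 Updated 5 months ago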
- Text Normalization & Inverse Text Normalization ☆716 Updated last month
- ☆127 Updated 4 years ago
- Performance benchmarks of major Chinese word segmenters ☆159 Updated 6 years ago
- Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese to Zhuyin or Pinyin (INTERSPEECH 2022) ☆370 Updated 6 months ago
- Simple conversion and localization between simplified and traditional Chinese using tables from MediaWiki. ☆560 Updated last year
- PERT: Pre-training BERT with Permuted Language Model ☆366 Updated 5 months ago
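- A tool for extracting Chinese dates, times, and numeric quantities ☆69 Updated 5 years ago
- Chinese text similarity calculator ☆167 Updated last year
- Chinese sentence segmentation and punctuation restoration, implemented with PyTorch 1.0. ☆58 Updated 6 years ago
- Pinyin data for words and phrases ☆509 Updated 5 months ago
- Overrides the built-in pinyin data in pypinyin with the pinyin data files from pinyin-data and phrase-pinyin-data ☆66 Updated 11 months ago
- A small package to fuzzy match Chinese words ☆94 Updated 2 years ago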
- Grapheme-to-Phoneme lexicons for Chinese dialects ☆69 Updated 3 years ago
- Estimate the phonetic distance between Chinese words and get similar sounding candidate words. ☆38 Updated 3 months ago
- LERT: A Linguistically-motivated Pre-trained Language Model ☆221 Updated 5 months ago
- clueai toolkit: customize the API you need with 3 lines of code in 3 minutes! ☆232 Updated 2 years ago
- A Chinese character dataset with related information such as stroke count, radical, pinyin, and English definitions/synonyms. ☆125 Updated 5 years ago
- llama.cpp with Unicode (Windows) support ☆54 Updated 2 years ago
- Cantonese word segmentation tool ☆48 Updated 7 years ago