rime-aca / corpus
Classical Chinese corpus
☆281 Updated 2 years ago
Alternatives and similar repositories for corpus:
Users who are interested in corpus are comparing it to the libraries listed below.
- Chinese character decomposition dictionary ☆749 Updated 2 years ago
- GuwenBERT: A Pre-trained Language Model for Classical Chinese (Literary Chinese) ☆515 Updated 3 years ago
- Hanzi Decomposition Library: breaks Chinese characters down into radicals and components, for use as glyph features of Chinese characters in machine learning ☆355 Updated 3 months ago
- Classical Chinese text database ☆262 Updated 6 years ago
- Full text of the 《现代汉语词典》 (Contemporary Chinese Dictionary), 7th edition, as TXT ☆255 Updated 7 months ago
- WeChat official account corpus ☆574 Updated 6 years ago
- Chinese-related dictionaries and corpora. ☆169 Updated 10 years ago
- A database of ancient Chinese poems and ancient Chinese rhymes (pronunciation). ☆92 Updated 9 years ago
- Code table for a one-handed Chinese stroke-order input method ☆96 Updated 7 months ago
- course project ☆122 Updated 5 years ago
- OpenCC made with Python ☆542 Updated last year
- Simple conversion and localization between simplified and traditional Chinese using tables from MediaWiki. ☆530 Updated 9 months ago
- Jiayan (甲言), the 1st NLP toolkit designed for Classical Chinese, supports lexicon construction, word segmentation, part-of-speech tagging, sentence segmentation, and punctuation. ☆596 Updated 3 years ago
- Classical Chinese (Literary Chinese) dictionary: scrapes a Classical Chinese dictionary website to build a Kindle dictionary. ☆65 Updated 6 years ago
- Pinyin data for words and phrases ☆463 Updated 2 weeks ago
- Chinese word segmentation algorithm that requires no corpus ☆497 Updated 4 years ago
- A Chinese lexicon ☆347 Updated 10 years ago
- Somiao Pinyin (搜喵拼音输入法): Train your own Chinese input method with a Seq2seq model ☆266 Updated 4 years ago
- ☆55 Updated 7 years ago
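- Poetry analysis program ☆245 Updated 7 years ago
- Android app: lookup of Chinese character pronunciations, ancient and modern, Chinese and foreign ☆230 Updated 5 years ago
- Collection of Rime (中州韻) input method phonetic spelling schemas for non-Mandarin Sinitic languages, dialects, and ancient Chinese ☆193 Updated this week
- THUOCL (THU Open Chinese Lexicon): a Chinese lexicon ☆891 Updated last year
- A part-of-speech-tagged Chinese corpus ☆198 Updated 10 years ago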
- NLU is hard!!! ☆270 Updated 5 years ago
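- A fairly comprehensive library of classical Chinese poems, ci lyrics, and prose, including 210,000 poems with annotations and commentary, more than 10,000 poets with introductions and biographies, introductions to more than 1,600 cipai (ci tune patterns), overviews of more than 70 Chinese dynasties, and nearly 200 category tags for classical poetry and prose ☆331 Updated last year
- Early Modern Chinese corpus dataset: natural language processing, corpus, ancient Chinese, Literary Chinese, digital humanities, computational linguistics ☆152 Updated last month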
- MicroTokenizer: A lightweight, full-featured Chinese tokenizer designed for educational and research purposes that helps students understand how tokenizers work. Provides a… ☆146 Updated 3 months ago
- Scrape poetry from gushiwen.org ☆40 Updated 8 years ago
- Chinese character dataset with related information such as stroke count, radical, pinyin, and English definitions/synonyms. ☆113 Updated 4 years ago