CoinLQ / TripitakaCharacterDataset
A dataset of single-character images segmented from page images of Tripitaka (大藏经) scriptures
☆9 Updated 8 years ago
Alternatives and similar repositories for TripitakaCharacterDataset:
Users who are interested in TripitakaCharacterDataset are comparing it to the repositories listed below
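- MNIST of Tibetan handwriting: image classification on the Chinese-made handwritten Tibetan MNIST dataset (TibetanMNIST), plus various fun experiments ☆31 Updated 6 years ago
- A Python toolkit for word segmentation of ancient texts in Traditional Chinese ☆32 Updated 3 years ago
- Extracts the radicals and pinyin of Chinese characters (pinyin for some rare characters is still missing; to be improved) ☆43 Updated 6 years ago
- A Chinese character feature extraction tool that extracts pronunciation (initial, final, tone), glyph (components, radical), four-corner code, and other features, which can also be fed to a model as tensors ☆134 Updated 4 years ago
- Classical Chinese (文言文) dictionary: scrapes a Classical Chinese dictionary website to build a Kindle dictionary ☆66 Updated 6 years ago
- Corpus of book titles. Also includes some film and game titles. ☆71 Updated last year
- Hanzi decomposition library that breaks Chinese characters down into radicals, for use as glyph features in machine learning | Hanzi Decomposition Library allows Chinese characters to be broken down into radicals and components… ☆365 Updated 5 months ago
- Early Modern Chinese corpus dataset. Keywords: natural language processing, corpus, ancient Chinese, Classical Chinese, digital humanities, computational linguistics ☆154 Updated 3 weeks ago
- Corpus of species names: plant names and animal names. ☆48 Updated last year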
- The Tripitaka Koreana in Han (TKH) Dataset and the Multiple Tripitaka in Han (MTH) Dataset for the research of Chinese character detectio… ☆63 Updated 4 years ago
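- Development of an automatic Chinese character decomposition system ☆102 Updated last year
- Collects and organizes OCR datasets and unifies their annotation formats for use in experiments ☆12 Updated last year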
- Ancient Chinese Corpus with Word Sense Annotation ☆47 Updated 10 months ago
- ☆18 Updated 2 years ago
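- English-Chinese text machine translation ☆19 Updated 5 years ago
- Recognition of vertically written Chinese calligraphy characters ☆64 Updated 5 years ago
- Classical Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard ☆48 Updated last year
- Distribution of visually similar Chinese characters ☆13 Updated 3 years ago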
- A tool for ancient Chinese segmentation. ☆53 Updated 5 years ago
- Compiled stroke data for Chinese characters, sourced from a Chinese character lookup website ☆32 Updated 8 years ago
- Tokenizer, POS-tagger, and Dependency-parser for Classical Chinese ☆66 Updated 4 months ago
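- Chinese character feature extractor (featurizer) that extracts pronunciation and glyph features of Chinese characters for use as deep learning features | A Chinese character feature extractor, which extracts the features of Chinese charac… ☆293 Updated 4 years ago
- Typo correction algorithm. Calls the pycorrector interface and applies rules. ☆68 Updated 5 years ago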
- ☆17 Updated 7 years ago
- SikuBERT (四库BERT): a pre-trained language model for the Siku Quanshu (四库全书) ☆125 Updated last year
- This is a pre-trained LSTM model. This model can help you to segment unpunctuated historical Chinese texts. ☆25 Updated 3 years ago
- ☆28 Updated 4 months ago
- An open-source classical Chinese information processing toolkit developed by Tsinghua Natural Language Processing Group ☆51 Updated 6 years ago
- A tool for quickly determining the place that a text (news article) belongs to ☆18 Updated 4 years ago
- Baidu Baike crawler ☆71 Updated 9 months ago