yukiyuqichen / OCR-Toolkit
A cute toolkit for OCR with GUI, including image preprocessing and text recognition. Works out of the box.
☆12 Updated last year
Alternatives and similar repositories for OCR-Toolkit:
Users who are interested in OCR-Toolkit are comparing it to the repositories listed below:
- Recognition of ancient Chinese books ☆12 Updated 3 years ago
- GuwenModels: A collection of Classical Chinese natural language processing models, gathering Classical Chinese models and related resources from across the internet. ☆173 Updated last year
- Corpus of recipe names. ☆15 Updated 3 years ago
- Recognize tables and text from scanned images that contain tables. ☆253 Updated last year
- Chinese character variant converter. ☆16 Updated 4 months ago
- An image dataset of individual characters segmented from images of Tripitaka (大藏经) scripture text ☆9 Updated 8 years ago
- Medical corpus. Corpus of medical institution names. Drug standard codes (药品本位码). ☆69 Updated last year
- Parallel corpus for translation between Classical Chinese and modern Chinese ☆101 Updated 3 years ago
- Corpus of book titles. Includes some film and game titles. ☆71 Updated last year
- ☆28 Updated 4 months ago
- Ancient Chinese Corpus with Word Sense Annotation ☆47 Updated 10 months ago
- Resource collection for "Digital Humanities Tutorial" (数字人文教程) ☆95 Updated 10 months ago
- 渊 - A project for Classical Chinese ☆100 Updated 3 years ago
- Classical Chinese punctuation experiment with Keras using the daizhige (殆知阁古代文献藏书) dataset ☆34 Updated 2 years ago
- Ancient Chinese (Classical Chinese) dictionary: scrapes a Classical Chinese dictionary website to build a Kindle dictionary. ☆66 Updated 6 years ago
- Character name statistics and relationship extraction for novels (based on HanLP) ☆39 Updated 5 years ago
- Classical Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard ☆48 Updated last year
- Corpus of species names. Plant names, animal names. ☆48 Updated last year
- Tokenizer, POS-tagger and Dependency-parser for Classical Chinese ☆66 Updated 4 months ago
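- Chinese, word segmentation, word lists, core dictionary, event word lists, stop words, sensitive words, question answering, question-answering data, knowledge graphs, text corpora. ☆159 Updated 3 years ago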
- A tool for quickly determining the location that a text (news article) belongs to ☆18 Updated 4 years ago
- A Python toolkit for word segmentation of ancient books in Traditional Chinese ☆32 Updated 3 years ago
- Baidu Hanyu dictionary crawler, pinyin data, and a large set of 350,000 Baidu dictionary entries. ☆24 Updated 2 years ago
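- CCL 2023, "Research on the Construction and Application of a Corpus of Ancient Chinese Phonetic Loan Characters (通假字)": phonetic loan character resource library ☆15 Updated last year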
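- Python 数地工厂 NLPSDK: keyword extraction, summary extraction, new word discovery, event triple extraction, data triple extraction, logic triple extraction, entity recognition, phrase chunk recognition, similarity computation, concept abstraction, semantic association, sentiment polarity classification, sentiment pair extraction, entity-attribute sentiment extraction, subjectivity computation, web page body parsing, web page table parsing, entity linking, 问题解… ☆17 Updated 4 years ago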
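- A Chinese character feature extraction tool that extracts pronunciation (initial, final, tone), glyph form (components, radicals), four-corner code, and other features from characters, and can also supply them to a model as tensor input ☆134 Updated 4 years ago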
- A corpus of Chinese abbreviations, including negative full forms. ☆194 Updated 3 years ago
- A relatively complete document analysis and recognition project ☆143 Updated 5 years ago
- GuwenBERT: A Pre-trained Language Model for Classical Chinese (Literary Chinese) ☆524 Updated 3 years ago
- ☆33 Updated 2 years ago