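Pre-modern Chinese corpus dataset. Natural language processing, corpus, ancient Chinese, Classical Chinese (文言文), digital humanities, computational linguistics
☆171 · Mar 4, 2025 · Updated last year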
Alternatives and similar repositories for Pre-modern_Chinese_corpus_dataset
Users who are interested in Pre-modern_Chinese_corpus_dataset are comparing it to the repositories listed below.
- An evaluation benchmark for Classical Chinese ☆19 · Dec 13, 2023 · Updated 2 years ago
- A Python toolkit for word segmentation of ancient texts in Traditional Chinese ☆38 · Jan 3, 2022 · Updated 4 years ago
- Jiayan (甲言), the first NLP toolkit designed for Classical Chinese; supports lexicon construction, word segmentation, part-of-speech tagging, sentence segmentation, and punctuation. ☆673 · Nov 2, 2021 · Updated 4 years ago
- SikuBERT: a pre-trained language model for the Siku Quanshu (四库全书) ☆164 · Jul 30, 2023 · Updated 2 years ago
- Ancient Chinese Corpus with Word Sense Annotation ☆69 · May 29, 2024 · Updated 2 years ago
- Annotations and code for the EMNLP 2018 paper 'Weeding out Conventionalized Metaphors: A Corpus of Novel Metaphor Annotations' ☆10 · Feb 20, 2023 · Updated 3 years ago
- Chinese NLP datasets, material for everyday experiments. Additions via pull requests are welcome. ☆37 · Dec 3, 2021 · Updated 4 years ago
- ☆24 · Aug 24, 2023 · Updated 2 years ago
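- Guizhou University's “水纹智识·灵境孪生” project, guided by the team of Yang Xiuzhang (杨秀璋) and advanced jointly through the efforts of Feng Jing (冯静), Luo Enrui (罗恩瑞), Guo Chunshan (郭春山), Li Cancan (李灿灿), and others. The resource applies AI techniques to the study of Shui culture, script, and ancient texts, and several universities are already participating. To better rescue and protect the endangered Shui script and intangible cultural heritage, the authors applied for and open-sourced the project, which mainly uses AI to recognize Shui manuscripts (水书), … ☆52 · Jun 30, 2025 · Updated 11 months ago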
- Raw text of Shen Bao (申報) ☆27 · Jan 17, 2022 · Updated 4 years ago
- A Benchmark for Classical Chinese Based on a Crowdsourcing System. ☆60 · May 25, 2021 · Updated 5 years ago
- Tokenizer, POS-tagger, and dependency parser for Classical Chinese ☆15 · Dec 30, 2025 · Updated 5 months ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆40 · May 24, 2026 · Updated 2 weeks ago
- A small program for automatically labeling English-language literature with Chinese Library Classification (《中国图书馆分类法》) codes ☆13 · Oct 29, 2024 · Updated last year
- ☆23 · Jun 2, 2019 · Updated 7 years ago
- A very comprehensive parallel corpus of Classical Chinese and modern Chinese ☆1,445 · Apr 21, 2024 · Updated 2 years ago
- The official GitHub repository for AC-EVAL, an ancient Chinese evaluation suite for large language models (LLMs) ☆17 · Nov 12, 2024 · Updated last year
- Corpus of classical Chinese poetry ☆28 · Sep 1, 2016 · Updated 9 years ago
- Data dictionary, Chinese character dictionary, Chinese character database, the 305 poems of the Shijing (诗经), and 250,000 personal names ☆24 · Jul 21, 2022 · Updated 3 years ago
- Classical Chinese punctuation experiment with Keras using the daizhige (殆知阁古代文献藏书) dataset ☆35 · Dec 8, 2022 · Updated 3 years ago
- A large corpus of Chinese fixed phrases and idioms scraped from a reputable educational website (30310 instances). ☆14 · Oct 29, 2021 · Updated 4 years ago
- Daizhige (殆知阁) ancient texts ☆1,565 · May 13, 2024 · Updated 2 years ago
- GuwenBERT: A Pre-trained Language Model for Classical Chinese (Literary Chinese) ☆563 · Aug 31, 2021 · Updated 4 years ago
- Large Scale Chinese Corpus for NLP ☆9,900 · Feb 6, 2026 · Updated 4 months ago
- The most comprehensive corpus of modern Chinese-language poetry on Earth: 3k+ poets, 80K+ poems, 15M+ characters ☆729 · Sep 12, 2025 · Updated 8 months ago
- Basic classical texts of ancient China ☆96 · May 5, 2025 · Updated last year
- A toolset for computation and comparison of Chinese dialects ☆49 · May 3, 2026 · Updated last month
- A tool for text normalisation via character-level machine translation ☆13 · Jun 12, 2020 · Updated 5 years ago
- Resource collection for the Digital Humanities Tutorial (《数字人文教程》) ☆118 · May 28, 2024 · Updated 2 years ago
- Bot Friday Club - BOT5 ☆40 · Sep 13, 2023 · Updated 2 years ago
- A Corpus for Classical Chinese Language Event Extraction ☆25 · Nov 11, 2025 · Updated 7 months ago
- An Ellipsis-aware Chinese Dependency Treebank for Web Text ☆26 · May 14, 2018 · Updated 8 years ago
- Applies the Traditional-Chinese-Handwriting-Dataset to handwriting recognition with a CNN model. ☆36 · Oct 5, 2023 · Updated 2 years ago
- Medical corpus. Corpus of medical institution names. National drug codes (药品本位码). ☆70 · Mar 27, 2024 · Updated 2 years ago
- Deep Learning For Ultrasound Tongue Imaging ☆13 · Dec 17, 2024 · Updated last year
- Custom transition animations ☆12 · Dec 9, 2015 · Updated 10 years ago
- Explores Chinese language models with sub-character level visual information ☆16 · Oct 5, 2018 · Updated 7 years ago
- Generative Art with Context-Free Grammars ☆12 · Oct 15, 2019 · Updated 6 years ago
- Easy trees in LaTeX and TikZ ☆14 · Dec 16, 2022 · Updated 3 years ago