Simple conversion and localization between simplified and traditional Chinese using tables from MediaWiki.
☆565 · Apr 17, 2024 · Updated last year
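As a rough illustration of what table-driven conversion can look like, here is a minimal sketch of greedy longest-match substitution over a tiny hand-written table. The table entries, the `convert_with_table` name, and the matching strategy are assumptions made for illustration; they are not zhconv's actual tables, API, or algorithm.

```python
# Hypothetical toy table mapping traditional forms to simplified forms.
# Real MediaWiki-derived tables are far larger; these entries are illustrative only.
TOY_TABLE = {
    "軟體": "软件",  # phrase-level entry: regional wording differs, not just the glyphs
    "軟": "软",
    "體": "体",
    "學": "学",
}


def convert_with_table(text, table):
    """Replace the longest matching table key at each position, left to right."""
    max_len = max(map(len, table), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so phrase entries win over single characters.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No table entry starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(convert_with_table("學軟體", TOY_TABLE))  # 学软件
```

Trying the longest key first is what lets a phrase-level entry such as `軟體` override the character-level entries for `軟` and `體`, which is the difference between plain character conversion and localization.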
Alternatives and similar repositories for zhconv
Users who are interested in zhconv compare it with the libraries listed below.
- Conversion between Traditional and Simplified Chinese ☆9,508 · Updated this week
- OpenCC made with Python ☆568 · Dec 8, 2023 · Updated 2 years ago
- Hanzi Converter for Traditional and Simplified Chinese ☆189 · Mar 28, 2020 · Updated 5 years ago
- Convert Chinese characters to pinyin (pypinyin) ☆5,269 · Feb 15, 2026 · Updated 3 weeks ago
- Constants used in Chinese text processing ☆387 · Dec 11, 2024 · Updated last year
- Large Scale Chinese Corpus for NLP ☆9,864 · Feb 6, 2026 · Updated last month
- Jieba Chinese word segmentation ☆34,782 · Aug 21, 2024 · Updated last year
- Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series) ☆10,177 · Jul 15, 2025 · Updated 7 months ago
- 100+ pre-trained Chinese Word Vectors ☆12,185 · Oct 30, 2023 · Updated 2 years ago
- 📦 Fast conversion between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more) ☆756 · Dec 21, 2024 · Updated last year
- pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box. ☆6,385 · Jan 12, 2026 · Updated last month
- Chinese synonyms: a toolkit for chatbots and intelligent question answering ☆5,104 · Feb 1, 2026 · Updated last month
- Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard ☆4,235 · Feb 6, 2026 · Updated last month
- The official repository for ERNIE 4.5 and ERNIEKit, its industrial-grade development toolkit based on PaddlePaddle. ☆7,692 · Jan 4, 2026 · Updated 2 months ago
- Forced alignment decoder for Whisper. ☆14 · Mar 13, 2024 · Updated last year
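- BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model) ☆8,283 · Oct 16, 2024 · Updated last year
- Common Chinese stopword lists (Harbin Institute of Technology stopword list, Baidu stopword list, etc.) ☆5,480 · Jan 25, 2024 · Updated 2 years ago
- An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes. ☆1,386 · May 31, 2022 · Updated 3 years ago
- An accurate, efficient, and easy-to-use Chinese NLP Preprocessing & Parsing Package. www.jionlp.com ☆3,800 · Nov 27, 2025 · Updated 3 months ago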
- A code library for training acoustic models ☆27 · May 20, 2020 · Updated 5 years ago
- Simple voice activity detection (VAD) algorithm in Python ☆15 · Aug 10, 2023 · Updated 2 years ago
- A curated list of resources for Chinese NLP ☆7,928 · Jul 27, 2023 · Updated 2 years ago
- Some meaningless nscripter tools. ☆677 · Jul 8, 2020 · Updated 5 years ago
- Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24) ☆59 · Apr 3, 2025 · Updated 11 months ago
- An NLP Toolset With A Focus on Explainable Inference ☆622 · Feb 3, 2021 · Updated 5 years ago
- Trying to reproduce suno v3 ☆35 · Jan 29, 2025 · Updated last year
- (WIP) Long-form speech generation ☆31 · Apr 2, 2025 · Updated 11 months ago
- Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo ☆3,106 · May 9, 2024 · Updated last year
- Text Normalization & Inverse Text Normalization ☆727 · Feb 27, 2026 · Updated last week
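- Search all Chinese NLP datasets, with commonly used English NLP datasets included ☆4,421 · Nov 21, 2022 · Updated 3 years ago
- Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance ☆3,988 · May 25, 2021 · Updated 4 years ago
- Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified-traditional conversion, natural language processing ☆36,167 · Nov 15, 2025 · Updated 3 months ago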
- Crowdsourced and Automatic Speech Prominence Estimation ☆25 · Apr 12, 2024 · Updated last year
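- RoBERTa Chinese pre-trained model: RoBERTa for Chinese ☆2,774 · Jul 22, 2024 · Updated last year
- Chinese NLP datasets, material for everyday experiments. Additions via pull requests are welcome. ☆4,575 · Nov 21, 2023 · Updated 2 years ago
- xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text error correction, text-to-pinyin conversion, text summarization, radical lookup, sentence representation, text similarity computation, and more ☆1,297 · Nov 12, 2022 · Updated 3 years ago
- Baidu's open-source dependency parsing system ☆1,003 · Feb 5, 2023 · Updated 3 years ago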
- Cantonese Linguistics and NLP ☆397 · May 23, 2024 · Updated last year
- Use the C API and SWIG to speed up jieba: an efficient Chinese word segmentation library ☆640 · Aug 27, 2021 · Updated 4 years ago