yongzhuo / char-similar
Character similarity: glyph, pinyin, and semantic similarity for Chinese characters (single-character level; usable for data augmentation and for building confusion sets in CSC typo detection and recognition tasks)
☆15Updated 2 weeks ago
Alternatives and similar repositories for char-similar
Users that are interested in char-similar are comparing it to the libraries listed below
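- Papers, competitions, and tools related to Chinese text correction.☆59Updated this week
- The hanzi similar tool. (A Chinese character similarity tool and an algorithm for visually similar characters; usable for correcting handwritten character recognition, text obfuscation, etc.)☆269Updated last year
- Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Gr…☆550Updated 2 years ago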
- Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings)☆90Updated 5 months ago
- text correction papers☆306Updated last year
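- SIGHAN Chinese error correction datasets, with converted formats☆64Updated 5 years ago
- First-place solution for Track 2 (CGED-8) of the CCL2022 Chinese learner text correction evaluation task☆54Updated 2 years ago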
- code and data for "CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers"☆71Updated 11 months ago
- A pre-training-based sentence embedding generation tool☆137Updated 2 years ago
- ☆78Updated 11 months ago
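- This project uses layoutlmv3 for training and inference on Chinese images. It mainly addresses three issues: 1. normalizing data into a usable training dataset format; 2. modifying tokenization for layoutlmv3-base-chinese; 3. splitting and sliding-window handling for text exceeding length 512☆53Updated 10 months ago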
- Minimal keyword extraction with BERT☆85Updated 3 years ago
- 🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition tool supporting models such as BertSoftmax and BertSpan, ready to use out of the box.☆115Updated last year
- Template-based text correction; Automatically Mining Error Templates for Grammatical Error Correction☆41Updated 3 years ago
- Evaluates the fluency of natural language☆116Updated 3 years ago
- A Multi-modal Model Chinese Spell Checker Released on ACL2021.☆160Updated last year
- Source code for the paper "Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granular…☆41Updated 2 years ago
- experiments of some semantic matching models and comparison of experimental results.☆162Updated 2 years ago
- CINO: Pre-trained Language Models for Chinese Minority (pre-trained models for ethnic minority languages)☆247Updated last week
- sentence-transformers to onnx; makes sbert model inference faster☆164Updated 3 years ago
- [TALLIP] General and Domain Adaptive Chinese Spelling Check with Error Consistent Pretraining☆58Updated last year
- PyTorch implementations of BERT-based Spelling Error Correction Models.☆272Updated 5 months ago
- ChineseBert for Chinese spelling correction☆41Updated 2 years ago
- 3,000,000+ semantic understanding and matching dataset. Usable for unsupervised contrastive learning, semi-supervised learning, etc., to build the best-performing pre-trained models for Chinese☆299Updated 2 years ago
- CCL 2022 Chinese learner text correction evaluation☆141Updated 2 years ago
- The real “Deep Learning for Humans”☆141Updated 3 years ago
- ☆48Updated last year
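- Chinese text classification and sequence labeling toolkit (pytorch), supporting multi-class and multi-label classification of long and short Chinese texts, and sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization…☆348Updated last year
- Chinese annotation tool supporting NER, text classification, relation annotation, dialogue annotation, etc.☆79Updated 11 months ago
- Reproduction of SimCSE on Chinese, supervised + unsupervised☆278Updated 5 months ago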