mattzheng / py-kenlm-model
python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, and more
☆169 · Sep 27, 2019 · Updated 6 years ago
Alternatives and similar repositories for py-kenlm-model
Users that are interested in py-kenlm-model are comparing it to the libraries listed below
- kenlm language model, with a Python REST service ☆30 · Aug 1, 2018 · Updated 7 years ago
- KenLM: Faster and Smaller Language Model Queries ☆2,734 · Mar 30, 2025 · Updated 10 months ago
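- Extracts error-correction corpora from Wikipedia's edit history data. ☆12 · Apr 6, 2022 · Updated 3 years ago
- Faster and more effective Chinese new word discovery ☆513 · Mar 15, 2024 · Updated last year
- New word discovery / new word mining / degree of freedom / cohesion / python3 ☆10 · May 28, 2019 · Updated 6 years ago
- pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box. ☆6,368 · Jan 12, 2026 · Updated last month
- CTC+Beam_Search+kenlm is a decoding system that uses Chinese characters as the modeling unit of the acoustic model ☆48 · Jun 27, 2018 · Updated 7 years ago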
- SpellGCN ☆251 · Feb 28, 2021 · Updated 4 years ago
- A No-Recurrence Sequence-to-Sequence Model for Speech Recognition ☆379 · Jul 21, 2022 · Updated 3 years ago
- Keyword extraction techniques ☆18 · Sep 11, 2019 · Updated 6 years ago
- Speaker Diarization library in Python. Performs VAD, Segmentation, Linear Clustering, Hierarchical Clustering ☆15 · Jul 28, 2017 · Updated 8 years ago
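- python3 implementation of new word discovery using mutual information and left/right entropy ☆593 · Aug 1, 2019 · Updated 6 years ago
- A simple, easy-to-use version of TinyBert: a pre-trained language model obtained by knowledge distillation from Bert ☆270 · Oct 24, 2020 · Updated 5 years ago
- Fifth-place solution in the 2018 university algorithm competition 神策杯 (Shence Cup) ☆18 · Oct 22, 2018 · Updated 7 years ago
- 2019 SOTA spell-checking tool for Simplified and Traditional Chinese: FASPell Chinese Spell Checker (Chinese Spell Check / Chinese spelling error detection / Chinese spelling correction / Chinese spell checking) ☆1,224 · Sep 3, 2022 · Updated 3 years ago
- Chinese version of reformer-pytorch, a simple and efficient generative model with results similar to GPT2 ☆16 · Jun 12, 2023 · Updated 2 years ago
- 📦 Quickly converts between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more) ☆754 · Dec 21, 2024 · Updated last year
- NER-based text error correction ☆15 · Dec 27, 2023 · Updated 2 years ago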
- ChiNese Text Normalization (CNTN) tool for Text-to-speech system ☆37 · Apr 12, 2018 · Updated 7 years ago
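- A Chinese string similarity method based on the "sound-shape code" (音形码) ☆227 · Jul 24, 2020 · Updated 5 years ago
- Open-source release of the MuCGEC Chinese error-correction dataset and SOTA text-correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Gr… ☆563 · Jun 9, 2023 · Updated 2 years ago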
- ☆22 · Oct 9, 2020 · Updated 5 years ago
- Online (real-time) decoder to be used with DeepSpeech2 model ☆25 · Feb 27, 2020 · Updated 5 years ago
- A Python module that converts Chinese written-form strings to spoken-form strings. ☆124 · Oct 8, 2019 · Updated 6 years ago
- how to generate the full-contextual labels from un-seen text for the application of HMM-based speech synthesis (HTS) ☆12 · Nov 22, 2019 · Updated 6 years ago
- python3 example of computing Chinese text similarity using TF feature vectors and Simhash fingerprints ☆10 · Dec 13, 2019 · Updated 6 years ago
- machine translation data process tools ☆10 · Apr 29, 2024 · Updated last year
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- Text-Dependent Speaker Recognition System with Machine Learning Techniques ☆10 · Dec 31, 2017 · Updated 8 years ago
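- gensim-fast2vec adaptation for flexible use of large-scale external word vectors (with OOV lookup support) ☆23 · Jun 3, 2019 · Updated 6 years ago
- language model in Chinese, implemented following the official Pytorch documentation ☆68 · May 22, 2018 · Updated 7 years ago
- A rule-based text error correction system. ☆121 · Jul 14, 2021 · Updated 4 years ago
- ccks baidu entity link: entity linking, first place ☆843 · Dec 19, 2023 · Updated 2 years ago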
- Source code to reproduce results of our paper "DIET: Lightweight Language Understanding for Dialogue Systems" ☆64 · May 12, 2020 · Updated 5 years ago
- A fast, pure-Python, untyped, in-memory database engine, using Python syntax to manage data, instead of SQL, inspired by PyDbLite. ☆20 · Oct 9, 2017 · Updated 8 years ago
- Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning ☆28 · Apr 25, 2019 · Updated 6 years ago
- Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2 ☆115 · May 20, 2019 · Updated 6 years ago
- ☆276 · Jan 15, 2021 · Updated 5 years ago
- Technical Innovation Award solution for the CCKS 2019 Chinese short-text entity linking competition ☆412 · Mar 24, 2023 · Updated 2 years ago