mattzheng/py-kenlm-model

mattzheng / py-kenlm-model

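python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, and more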

☆172

Alternatives and similar repositories for py-kenlm-model

Users who are interested in py-kenlm-model are comparing it to the libraries listed below.

flowers2023 / lm-ken
A kenlm language model, with a Python REST service
☆30 | Aug 1, 2018 | Updated 7 years ago
kpu / kenlm
KenLM: Faster and Smaller Language Model Queries
☆2,793 | Mar 30, 2025 | Updated last year
elephantnose / words-mining
New word discovery / new word mining / degree of freedom / degree of cohesion / python3
☆10 | May 28, 2019 | Updated 7 years ago
bojone / word-discovery
Faster and more effective Chinese new word discovery
☆512 | Mar 15, 2024 | Updated 2 years ago
shibing624 / pycorrector
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error correction scenarios and works out of the box.
☆6,496 | Updated this week
tongchangD / bert_ner_for_corrector
NER-based text error correction
☆15 | Dec 27, 2023 | Updated 2 years ago
ACL2020SpellGCN / SpellGCN
SpellGCN
☆249 | Feb 28, 2021 | Updated 5 years ago
Sundy1219 / ctc_beam_search_lm
CTC+Beam_Search+kenlm is a decoding system for acoustic models that use Chinese characters as the modeling unit
☆49 | Jun 27, 2018 | Updated 8 years ago
ZhengkunTian / OpenTransformer
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
☆378 | Jul 21, 2022 | Updated 4 years ago
damo894127201 / KeywordExtraction
Keyword extraction techniques
☆18 | Sep 11, 2019 | Updated 6 years ago
nutcrtnk / DHGNet
Code for paper "Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph", EMNLP 2021 - findings.
☆13 | Dec 14, 2021 | Updated 4 years ago
jiaohuix / nmt_data_tools
machine translation data process tools
☆10 | Apr 29, 2024 | Updated 2 years ago
iqiyi / FASPell
2019 SOTA spell checker for Simplified and Traditional Chinese: FASPell Chinese Spell Checker (Chinese spell check / Chinese spelling error detection / Chinese spelling error correction)
☆1,224 | Sep 3, 2022 | Updated 3 years ago
napoler / reformer-chinese
Chinese version of reformer-pytorch, a simple and efficient generative model with results similar to GPT2
☆16 | Jun 12, 2023 | Updated 3 years ago
Lisennlp / TinyBert
A simple, easy-to-use TinyBert: a pretrained language model built by knowledge distillation from Bert
☆271 | Oct 24, 2020 | Updated 5 years ago
BitSpeech / SRILM
Mirror of SRILM
☆61 | Aug 11, 2020 | Updated 5 years ago
Dikea / Shence-Cup-Keyword-Extractor
Fifth-place solution for the Shence Cup, 2018 University Algorithm Competition
☆18 | Oct 22, 2018 | Updated 7 years ago
zhanzecheng / Chinese_segment_augment
New word discovery using mutual information and left/right entropy, implemented in python3
☆593 | Aug 1, 2019 | Updated 6 years ago
hpandana / gradient-accumulation-tf-estimator
Gradient accumulation on tf.estimator
☆12 | Dec 15, 2020 | Updated 5 years ago
ictnlp / NMLA-NAT
Code for NeurIPS 2022 Spotlight paper "Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation"
☆20 | Nov 16, 2022 | Updated 3 years ago
li-aolong / li-aolong.github.io
李傲龍's blog
☆81 | Jul 17, 2024 | Updated 2 years ago
igormq / ctcdecode-pytorch
Python implementation of CTC beam search decoder + agnostic LM scorer
☆20 | Dec 16, 2020 | Updated 5 years ago
songyaheng / sparkNLP
spark, NLP, new word discovery, natural language processing
☆23 | Mar 16, 2018 | Updated 8 years ago
pogevip / BERT4TensorRT
TensorRT
☆11 | Sep 22, 2020 | Updated 5 years ago
hiyoung123 / YoungCorrector
A rule-based text error correction system.
☆121 | Jul 14, 2021 | Updated 5 years ago
Ailln / cn2an
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
☆765 | Apr 23, 2026 | Updated 3 months ago
EdisonChen0816 / chatbot
Chit-chat bot
☆11 | Aug 12, 2020 | Updated 5 years ago
liwenju0 / error_text_gen
Generates the large amounts of data needed by text error correction models (such as Gector).
☆15 | Jan 5, 2023 | Updated 3 years ago
geeklili / TextRank_Algorithm
A simple implementation of TextRank
☆10 | Nov 12, 2020 | Updated 5 years ago
Chuanyunux / Chinese-NewWordRecognition
Domain-specific lexicon construction / Chinese new word discovery / domain lexicon discovery
☆31 | Jan 10, 2020 | Updated 6 years ago
tongchangD / bert_for_corrector
Chinese text error correction based on bert
☆242 | Jun 12, 2023 | Updated 3 years ago
kmario23 / KenLM-training
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
☆116 | May 20, 2019 | Updated 7 years ago
HillZhang1999 / MuCGEC
Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Gr…
☆570 | Jun 9, 2023 | Updated 3 years ago
bojone / text_compare
Compare two strings in python and highlight the differences
☆27 | Jul 20, 2020 | Updated 6 years ago
zedom1 / Error-Detection
Code for Chinese error detection module, using n-gram and bi-lstm
☆136 | Mar 31, 2019 | Updated 7 years ago
clue-ai / PromptCLUE
PromptCLUE, a zero-shot learning model that supports all Chinese-language tasks
☆663 | Jun 16, 2023 | Updated 3 years ago
mattzheng / gensim-fast2vec
gensim-fast2vec: a modified gensim for flexible use of large-scale external word vectors (with OOV lookup support)
☆23 | Jun 3, 2019 | Updated 7 years ago
huawei-noah / Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
☆3,162 | Jan 22, 2024 | Updated 2 years ago
open-speech / cn-text-normalizer
A Python module that converts Chinese written-form strings to spoken-form strings.
☆124 | Oct 8, 2019 | Updated 6 years ago