byronhe / cppjieba
C++ version of the "Jieba" Chinese word segmenter, using a darts Double Array Trie to reduce memory usage to 1/100
☆46Updated 2 years ago
Related projects
Alternatives and complementary repositories for cppjieba
- BERT Tokenizer in C++☆74Updated 3 years ago
- [This project is no longer maintained] Converts Chinese characters to pinyin, with support for polyphonic characters; 拼音 -> pin yin☆206Updated last year
- C++ model train&inference framework☆223Updated 4 years ago
- A clone of Darts (Double-ARray Trie System)☆142Updated 5 years ago
- Self-implemented Pinyin2Chinese demo using a Trie and an HMM: a simple demo of pinyin segmentation and pinyin-to-Chinese conversion based on a hidden Markov model and a Trie.☆84Updated 6 years ago
- Edge Machine Learning Library☆191Updated 2 years ago
- High-performance text tokenizer library☆26Updated 9 months ago
- C++ headers(hpp) library with Python style.☆127Updated last month
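- Pinyin data for words and phrases☆448Updated 8 months ago
- Python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, and more☆162Updated 5 years ago
- A lightweight speech recognition decoding and inference framework trimmed down from Kaldi; currently implements MFCC+GMM+Viterbi, with no dependency on libraries such as OpenFST or OpenBLAS☆21Updated 3 years ago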
- ☆120Updated 3 years ago
- This project aims to extract speech recognition training data from videos, for training automatic subtitle generation☆53Updated 6 years ago
- KSAI Lite is a deep learning inference framework of kingsoft, based on tensorflow lite☆94Updated 2 years ago
- Speech recognition model converted from PyTorch to ONNX to MNN, with deployment implemented in C++☆40Updated 2 years ago
- Somiao Pinyin (搜喵拼音输入法): Train your own Chinese Input Method with Seq2seq Model☆266Updated 4 years ago
- Overrides the built-in pinyin data in pypinyin with the pinyin data files from pinyin-data and phrase-pinyin-data☆44Updated 8 months ago
- Estimate the phonetic distance between Chinese words and get similar sounding candidate words.☆35Updated last year
- A simple TTS (text-to-speech) engine for Mandarin Chinese☆19Updated 12 years ago
- c++ code for merlin tts☆22Updated 5 years ago
- Port of Funasr's Paraformer model in C/C++☆25Updated 4 months ago
- A cross platform implementation of Text-to-Speech based on ONNXRuntime.☆31Updated last year
- Automatic error correction for Chinese words☆121Updated 3 years ago
- PaddleSpeech TTS cpp☆35Updated last year
- A Python module that converts written Chinese strings to their spoken form.☆118Updated 5 years ago
- Mirror of SRILM☆53Updated 4 years ago
- a Chinese tokenizer☆17Updated 11 years ago