byronhe / cppjieba
C++ version of the "Jieba" (结巴) Chinese word segmenter; uses the darts Double Array Trie to reduce memory usage to 1/100
☆50 Updated 2 years ago
Alternatives and similar repositories for cppjieba
Users who are interested in cppjieba are comparing it to the libraries listed below
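- High-performance text tokenizer library ☆29 Updated last year
- [This project is no longer maintained] Converts Chinese characters to pinyin, with support for polyphonic characters, e.g. 拼音 -> pin yin ☆211 Updated last month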
- Transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP) ☆17 Updated 3 years ago
- C++ model training and inference framework ☆224 Updated 5 years ago
- Edge Machine Learning Library ☆195 Updated 2 years ago
- A clone of Darts (Double-ARray Trie System) ☆149 Updated last month
- The simple header file library of CppJieba ☆41 Updated 10 years ago
- KSAI Lite is a deep learning inference framework from Kingsoft, based on TensorFlow Lite ☆95 Updated 3 years ago
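- Somiao Pinyin (搜喵拼音输入法): Train your own Chinese Input Method with a Seq2seq Model ☆270 Updated 5 years ago
- ☆125 Updated 4 years ago
- Lightweight speech recognition decoding and inference framework trimmed down from Kaldi; currently implements MFCC+GMM+Viterbi, with no dependency on libraries such as OpenFST or OpenBLAS ☆21 Updated 3 years ago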
- BERT Tokenizer in C++ ☆76 Updated 4 years ago
- ☆102 Updated 3 years ago
- ☆61 Updated 2 years ago
- A KWS demo on Android ☆39 Updated last year
- C API for CppJieba ☆57 Updated 2 years ago
- Large-scale Chinese corpus ☆42 Updated 5 years ago
- A simple TTS (text-to-speech) engine for Mandarin Chinese ☆20 Updated 13 years ago
- PaddleSpeech TTS cpp ☆39 Updated 2 years ago
- Pinyin2Chinese demo using algorithms including a Trie and an HMM: a simple demo implementation of pinyin segmentation and pinyin-to-Chinese conversion, based on a hidden Markov model and a Trie ☆86 Updated 7 years ago
- A Chinese tokenizer ☆17 Updated 12 years ago
- An Efficient Lexical Analyzer for Chinese ☆43 Updated 5 years ago
- Port of FunASR's Paraformer model in C/C++ ☆32 Updated last year
- Mirror of SRILM ☆56 Updated 4 years ago
- C++ headers (hpp) library with Python style. ☆133 Updated 2 months ago
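- Chinese Homophones: a library of Chinese near-homophone words and characters (including exact homophones) ☆106 Updated 5 years ago
- Juicer is a Weighted Finite State Transducer (WFST) based decoder for Automatic Speech Recognition (ASR). ☆62 Updated 9 years ago
- C++ Trie implementation with an Aho-Corasick automaton; supports UTF-8 and GBK encodings ☆13 Updated 5 years ago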
- Speech-end detection library, based on WebRTC's VAD engine ☆23 Updated last month
- Chinese "spelling" error correction ☆263 Updated 7 years ago