byronhe / cppjieba
C++ version of the "Jieba" Chinese word segmenter, using the darts Double Array Trie to reduce memory usage to 1/100
☆50Updated 2 years ago
Alternatives and similar repositories for cppjieba
Users that are interested in cppjieba are comparing it to the libraries listed below
- KSAI Lite is a deep learning inference framework from Kingsoft, based on TensorFlow Lite☆95Updated 3 years ago
- Edge Machine Learning Library☆195Updated 2 years ago
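- [This project is no longer maintained] Converts Chinese characters to pinyin, with support for polyphonic characters, 拼音 -> pin yin☆211Updated 2 months ago
- C language API for CppJieba☆58Updated 2 years ago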
- The simple header file library of CppJieba☆41Updated 10 years ago
- C++ headers(hpp) library with Python style.☆135Updated 3 months ago
- Pinyin data for words and phrases☆491Updated 3 months ago
- C++ implementation of the mmseg word segmentation algorithm☆33Updated 9 years ago
- transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP)☆17Updated 3 years ago
- Somiao Pinyin (搜喵拼音输入法): Train your own Chinese Input Method with Seq2seq Model☆271Updated 5 years ago
- ☆102Updated 3 years ago
- An Efficient Lexical Analyzer for Chinese☆812Updated 2 years ago
- C++ model train&inference framework☆223Updated 5 years ago
- a Chinese tokenizer☆17Updated 12 years ago
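- Compares the pronunciation and shape of 6,700 commonly used Chinese characters and outputs lists of similar-sounding and similar-looking characters. # similar characters☆461Updated last year
- High-performance text tokenizer library☆29Updated last year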
- A full-text search engine supporting massive users, real-time updating, fast fuzzy matching and flexible table splitting.☆490Updated 2 years ago
- An Efficient Lexical Analyzer for Chinese☆44Updated 5 years ago
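- A lightweight speech recognition decoding and inference framework trimmed down from Kaldi; currently implements MFCC+GMM+Viterbi, with no dependency on libraries such as OpenFST or OpenBLAS☆21Updated 3 years ago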
- BERT Tokenizer in C++☆76Updated 4 years ago
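- Simhash computation for Chinese documents☆1,145Updated 2 months ago
- Chinese character to pinyin conversion with lower memory usage and faster conversion speed☆39Updated 9 years ago
- The best tool for converting Chinese numerals to Arabic digits. Handles many Chinese expressions such as "点二八" (point two eight) and "负百分之四十" (negative forty percent). A must-have for NLP and robotics engineering! The Best Tool of Chinese Number to Digits☆367Updated 2 years ago
- Automatic spelling correction for Chinese words☆121Updated 4 years ago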
- Constants used in Chinese text processing☆373Updated 7 months ago
- Pinyin to Chinese character conversion, pinyin input method engine, pin yin -> 拼音☆615Updated 2 months ago
- Port of Funasr's Paraformer model in C/C++☆32Updated last year
- Real time vector search engine☆138Updated 2 years ago
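- Chinese character decomposition library that breaks characters down into radicals, for use as glyph features in machine learning | Hanzi Decomposition Library allows Chinese characters to be broken down into radicals and components…☆382Updated 9 months ago
- Integrates WebRTC's VAD to split audio files☆340Updated 4 years ago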