byronhe / cppjieba
C++ version of the "Jieba" Chinese word segmenter; uses the darts Double Array Trie to reduce memory usage to 1/100
☆48 Updated 2 years ago
Alternatives and similar repositories for cppjieba:
Users who are interested in cppjieba are comparing it to the libraries listed below:
- Transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP) ☆16 Updated 2 years ago
- C++ implementation of the MMSEG word segmentation algorithm ☆33 Updated 9 years ago
- Wraps cppjieba with SWIG. ☆17 Updated 6 years ago
- A clone of Darts (Double-ARray Trie System) ☆144 Updated 6 years ago
- Edge Machine Learning Library ☆193 Updated 2 years ago
- [No longer maintained] Converts Chinese characters to pinyin, with support for polyphonic characters; 拼音 -> pin yin ☆206 Updated last year
- High-performance text tokenizer library ☆28 Updated last year
- The simple header-file library of CppJieba ☆40 Updated 9 years ago
- BERT Tokenizer in C++ ☆75 Updated 4 years ago
- C++ headers (hpp) library with Python style. ☆131 Updated last month
- C++ trie implementation with an Aho-Corasick automaton; supports UTF-8 and GBK encodings ☆13 Updated 4 years ago
- Pinyin data for words and phrases ☆467 Updated last month
- Chinese preprocessing corpus ☆106 Updated 6 years ago
- ☆99 Updated 3 years ago
- A full-text search engine supporting massive users, real-time updating, fast fuzzy matching and flexible table splitting. ☆486 Updated last year
- C++ model train & inference framework ☆223 Updated 5 years ago
- Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL ☆543 Updated 4 years ago
- PaddleSpeech TTS cpp ☆36 Updated last year
- A simple Pinyin2Chinese demo of pinyin segmentation and pinyin-to-Chinese conversion, built on a trie and a hidden Markov model (HMM). ☆86 Updated 6 years ago
- Miniature Chinese keyword extraction service ☆55 Updated 7 years ago
- Compares the 6,700 most commonly used Chinese characters by sound and shape, and outputs lists of similar-sounding and similar-looking characters. # similar characters ☆446 Updated 10 months ago
- Topling core libraries in an ark ☆14 Updated 3 months ago
- C API for CppJieba ☆56 Updated 2 years ago
- Read-only unofficial mirror of the OpenGrm Thrax Grammar Development Tools ☆14 Updated 5 years ago
- An Efficient Lexical Analyzer for Chinese ☆40 Updated 5 years ago
- Port of Funasr's Paraformer model in C/C++ ☆28 Updated 7 months ago
- A highly efficient string matching tool supporting forward/backward maximum matching word segmentation and exact multi-pattern string matching ☆17 Updated last year
- ☆125 Updated 3 years ago
- A release version for https://github.com/athena-team/athena ☆126 Updated last year
- Somiao Pinyin: Train your own Chinese Input Method with Seq2seq Model (搜喵拼音输入法) ☆266 Updated 4 years ago