kampersanda / tongrams-rs
Rust library providing fast language model queries in compressed space
☆23 · Updated 2 years ago
Alternatives and similar repositories for tongrams-rs:
Users who are interested in tongrams-rs are comparing it to the libraries listed below.
- Rust library of natural language dictionaries using character-wise double-array tries. ☆29 · Updated last month
- Rust implementation of SIF and uSIF: Simple and fast sentence embedding ☆19 · Updated 3 weeks ago
- A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse) ☆16 · Updated 2 years ago
- Yada is yet another double-array trie library aiming for fast search and compact data representation. ☆34 · Updated 11 months ago
- Collection of succinct data structures in Rust ☆82 · Updated last year
- A tool for visualizing the internal structures of morphological analyzer Sudachi ☆17 · Updated 2 years ago
- Fast match expression optimized for string comparison ☆38 · Updated last year
- A library for semantic similarity search ☆24 · Updated 2 weeks ago
- Japanese tokenizer for Rust ☆34 · Updated 5 years ago
- IPAdic packaged for easy use from Python. ☆25 · Updated 3 years ago
- A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure in Rust. ☆209 · Updated last month
- A Rust implementation of a RoBERTa classification model for the SNLI dataset ☆13 · Updated 3 years ago
- ☆25 · Updated 2 years ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024) ☆10 · Updated 8 months ago
- ☆11 · Updated 3 years ago
- FAST is an annotation tool that focuses on mobile devices. https://aclanthology.org/2021.emnlp-demo.41/ ☆53 · Updated 3 years ago
- A small version of UniDic for easy pip installs. ☆42 · Updated 4 years ago
- ☆25 · Updated 3 months ago
- Finding all pairs of similar documents time- and memory-efficiently ☆58 · Updated 2 years ago
- A multi-language segmenter using high-order CRF. ☆17 · Updated 4 years ago
- Rust binding of primitiv ☆20 · Updated 6 years ago
- A C++ library implementing fast language model estimation using the 1-Sort algorithm. ☆17 · Updated last year
- optpy is a transpiler to generate a Rust file from a Python file ☆27 · Updated 2 years ago
- A Japanese tokenizer for Tantivy, based on TinySegmenter. ☆14 · Updated 3 years ago
- Yet another sentence-level tokenizer for Japanese text ☆22 · Updated 2 years ago
- Lindera tokenizer for Tantivy. ☆55 · Updated 2 months ago
- CC-CEDICT-MeCab is a MeCab dictionary for Chinese (Mandarin) text segmentation ☆11 · Updated 4 years ago
- Code for COLING 2020 Paper ☆13 · Updated last week
- A Japanese dependency parser based on BERT ☆22 · Updated 2 years ago
- Evaluation dataset for the Japanese honorific (keigo) conversion task ☆20 · Updated 2 years ago