Systemcluster / kitoken
Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.
☆40 Updated 3 months ago
Alternatives and similar repositories for kitoken
Users interested in kitoken are comparing it to the libraries listed below.
- Locality Sensitive Hashing ☆78 Updated 2 years ago
- Inference engine for GLiNER models, in Rust ☆83 Updated 3 weeks ago
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… ☆225 Updated 3 weeks ago
- A high-performance constrained decoding engine based on context free grammar in Rust ☆58 Updated 8 months ago
- wasm bindings for huggingface tokenizers library ☆34 Updated 3 years ago
- Modular Rust transformer/LLM library using Candle ☆38 Updated last year
- Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables ☆21 Updated 8 months ago
- HNSW module for Redis ☆59 Updated 5 years ago
- A complete (gRPC service and lib) Rust inference with multilingual embedding support. This version leverages the power of Rust for both GR… ☆39 Updated last year
- A Demo server serving Bert through ONNX with GPU written in Rust with <3 ☆42 Updated 4 years ago
- Rust crate for some audio utilities ☆26 Updated 10 months ago
- Run ONNX and TensorFlow inference in the browser. ☆75 Updated 3 years ago
- NLP with Rust for Python 🦀🐍