jermp / tongrams_estimation
A C++ library implementing fast language model estimation using the 1-Sort algorithm.
☆17 Updated last year
Alternatives and similar repositories for tongrams_estimation:
Users who are interested in tongrams_estimation are comparing it to the libraries listed below:
- A flexible variational inference LDA library. ☆22 Updated 5 years ago
- Efficient and effective query auto-completion in C++. ☆53 Updated last year
- 🌳 A compressed rank/select dictionary exploiting approximate linearity and repetitiveness. ☆11 Updated 2 years ago
- Implementation of QuadSketch algorithm ☆11 Updated 2 years ago
- Utilities for manipulating finite state transducers with the OpenFst library. ☆30 Updated 7 years ago
- A C++ library providing fast language model queries in compressed space. ☆129 Updated last year
- A fast high dimensional near neighbor search algorithm based on group testing and locality sensitive hashing ☆19 Updated last year
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers` ☆18 Updated 2 years ago
- Fast implementations of the scancount algorithm: C++ header-only library ☆26 Updated 5 years ago
- An Efficient Language Model Using Double-Array Structures ☆17 Updated 4 years ago
- Implementation of many similarity join algorithms. ☆15 Updated 10 years ago
- Universe-sliced indexes in C++. ☆18 Updated 2 years ago
- A SIMD-based C++ library providing rank/select queries over mutable bitmaps. ☆35 Updated 2 years ago
- A framework for building reranking models. ☆28 Updated 9 years ago
- Playing with arithmetic coding and RNNs ☆22 Updated 8 years ago
- Content Addressable Memory using dimensionality reduction ☆12 Updated 7 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Fast stand-alone C++ decoder for RNN-based NMT models ☆25 Updated 4 years ago
- An efficient algorithm for k-bounded (Damerau-)Levenshtein distance ☆16 Updated 6 years ago
- Simplifying parsing of large jsonline files in NLP Workflows ☆12 Updated 3 years ago
- bin files ☆13 Updated 3 weeks ago
- Implementation of generative semantic grammar. ☆18 Updated 2 years ago
- Read-only unofficial mirror of the OpenGrm NGram Library ☆8 Updated 5 years ago
- Implementation of the data structures described in the paper "Fast Compressed Tries using Path Decomposition". ☆55 Updated 2 years ago
- Suite of universal indexes for Highly Repetitive Document Collections ☆20 Updated 4 years ago
- QuickerADC is an implementation of highly-efficient product quantizers leveraging SIMD shuffle instructions integrated into FAISS ☆15 Updated 5 years ago
- BagMinHash - Minwise Hashing Algorithm for Weighted Sets ☆26 Updated 4 years ago
- A Space-Optimal Grammar Compression ☆10 Updated 4 years ago
- Python bindings for the fast integer compression library FastPFor. ☆58 Updated last year
- A tool for detecting sentence fragments. ☆7 Updated 8 years ago