jermp / tongrams_estimation
A C++ library implementing fast language model estimation using the 1-Sort algorithm.
☆17 Updated last year
Alternatives and similar repositories for tongrams_estimation:
Users who are interested in tongrams_estimation are comparing it to the libraries listed below.
- Fast stand-alone C++ decoder for RNN-based NMT models ☆25 Updated 4 years ago
- Efficient and effective query auto-completion in C++. ☆51 Updated last year
- Implementation of QuadSketch algorithm ☆11 Updated last year
- Anytime Ranking for Impact-Ordered Indexes ☆12 Updated 8 years ago
- Utilities for manipulating finite state transducers with the OpenFst library. ☆30 Updated 7 years ago
- MlpIndex - Extremely fast ordered index via memory level parallelism ☆12 Updated 5 years ago
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers` ☆18 Updated 2 years ago
- 🌳 A compressed rank/select dictionary exploiting approximate linearity and repetitiveness. ☆11 Updated 2 years ago
- Fast implementations of the scancount algorithm: C++ header-only library ☆26 Updated 5 years ago
- Playing with arithmetic coding and RNNs ☆22 Updated 8 years ago
- Robust Cross-lingual Embeddings from Parallel Sentences ☆21 Updated 4 years ago
- Universe-sliced indexes in C++. ☆18 Updated 2 years ago
- An Efficient Language Model Using Double-Array Structures ☆17 Updated 4 years ago
- A C++ library providing fast language model queries in compressed space. ☆128 Updated last year
- Development repository for Integrated Speech Corpus Analysis (ISCAN) ☆9 Updated 2 years ago
- ☆20 Updated 5 years ago
- Risk Minimization Algorithms in Structured Prediction (JMLR 2016) ☆13 Updated 7 years ago
- A SIMD-based C++ library providing rank/select queries over mutable bitmaps. ☆35 Updated 2 years ago
- Finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Implementation of many similarity join algorithms. ☆15 Updated 10 years ago
- Scripts supporting the development and serving of the Roots Search Tool - https://hf.co/spaces/bigscience-data/roots-search ☆10 Updated last year
- A Translation Task using TurboTransformers ☆11 Updated 4 years ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆46 Updated 3 weeks ago
- A Space-Optimal Grammar Compression ☆10 Updated 3 years ago
- Implementation of generative semantic grammar. ☆18 Updated 2 years ago
- An efficient algorithm for k-bounded (Damerau-)Levenshtein distance ☆16 Updated 6 years ago
- Fast Neural Machine Translation in C++ - development repository ☆19 Updated 8 months ago
- Zero-vocab or low-vocab embeddings ☆18 Updated 2 years ago
- A monolingual parallel corpus for sentence simplification ☆11 Updated 8 years ago
- Library for fast text representation and classification. ☆28 Updated last year