saffsd / langid.c
Pure C natural language identifier with support for 97 languages
☆26 Updated 8 years ago
Alternatives and similar repositories for langid.c
Users who are interested in langid.c are comparing it to the libraries listed below.
- Compact Language Detector 2 ☆890 Updated 4 years ago
- C++ implementation of Tomas Mikolov's word/document embedding ☆106 Updated 8 years ago
- Lightweight C++ translator for OpenNMT Torch models (deprecated) ☆81 Updated 5 years ago
- Simhashing in C++ ☆136 Updated 2 years ago
- Fast Neural Machine Translation in C++ - development repository ☆284 Updated 6 months ago
- A simple and fast discriminative sequence labeling toolkit ( http://wapiti.limsi.fr ) ☆256 Updated 3 years ago
- Fast and customizable text tokenization library with BPE and SentencePiece support ☆329 Updated 3 weeks ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆161 Updated 5 years ago
- ☆869 Updated 2 years ago
- A clone of Darts (Double-ARray Trie System) ☆158 Updated 8 months ago
- Fast Neural Machine Translation in C++ ☆1,418 Updated 2 years ago
- A Multilingual and Multilevel Representation Learning Toolkit for NLP ☆117 Updated 7 years ago
- A-C implementation in "C". Tight-packed (interleaved) state-transition matrix -- as fast as it gets, as small as it gets. ☆149 Updated 5 years ago
- C++ implementation for Neural Network-based NLP, such as LSTM machine translation! ☆86 Updated 8 years ago
- word2vec++ is a Distributed Representations of Words (word2vec) library and tools implementation, written in C++11 from scratch ☆140 Updated 2 years ago
- ☆31 Updated 3 years ago
- Simhash and near-duplicate detection ☆423 Updated 2 years ago
- Bitextor generates translation memories from multilingual websites ☆300 Updated last year
- Decoder, aligner, and model optimizer for statistical machine translation and other structured prediction models based on (mostly) contex… ☆185 Updated 5 years ago
- Python bindings for cld3 ☆27 Updated 2 years ago
- Universal dependencies homepage ☆40 Updated this week
- MARISA: Matching Algorithm with Recursively Implemented StorAge ☆594 Updated last week
- ZPar statistical parser. Universal language support (depending on the availability of training data), with language-specific features for… ☆135 Updated 9 years ago
- Embeddable C++17 Unicode library offering UTF encodings, general category info, simple and full casing, normalization forms, and combinin… ☆80 Updated 4 months ago
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆378 Updated 3 years ago
- C++ wrapper library for the NLP library spaCy ☆107 Updated 2 years ago
- Corpus preprocessing ☆99 Updated last year
- TheanoLM is a recurrent neural network language modeling tool implemented using Theano ☆81 Updated last year
- Examples, tutorials and use cases for Marian, including our WMT-2017/18 baselines. ☆81 Updated 2 years ago
- Fast, efficiently stored Trie for Python. Uses libdatrie. ☆547 Updated last month