commoncrawl / language-detection-cld2
Natural language detection, Java bindings for CLD2
☆14Updated this week
Alternatives and similar repositories for language-detection-cld2
Users who are interested in language-detection-cld2 are comparing it to the libraries listed below.
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts☆81Updated 7 years ago
- Java port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm☆67Updated 2 months ago
- Rust implementation of Duckling☆79Updated 4 years ago
- Context-sensitive word embeddings with subwords. In Rust.☆87Updated last year
- CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop☆37Updated 9 months ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆52Updated 5 years ago
- Set of Jupyter notebooks demonstrating Learning to Rank integrated with Solr and Elasticsearch☆172Updated 6 months ago
- Common Index File Format to support interoperability between open-source IR engines☆39Updated last year
- Performance evaluation of nearest neighbor search using Vespa, Elasticsearch and Open Distro for Elasticsearch K-NN☆117Updated 4 years ago
- Various utilities regarding Levenshtein transducers. (Java)☆58Updated 3 years ago
- Search relevance evaluation toolkit☆74Updated 3 years ago
- Lightning Fast Language Prediction 🚀☆167Updated last month
- fastText Rust binding☆62Updated last year
- Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.☆78Updated last year
- My most frequently used learning-to-rank algorithms ported to rust for efficiency. Try it: "pip install fastrank".☆52Updated 6 months ago
- User Behavior Insights standard schema specification☆31Updated last month
- Simple NLP in Rust with Python bindings☆153Updated 2 years ago
- Graph-based Approximate Nearest Neighbor Search☆319Updated last year
- NLP framework for JVM languages.☆151Updated 4 years ago
- Search for similar short strings☆53Updated 5 years ago
- Terrier IR Platform☆266Updated 2 months ago
- Querqy for Elasticsearch☆47Updated last week
- Search engine benchmark (Tantivy, Lucene, PISA, ...)☆92Updated last week
- Ontology for rustling☆130Updated 3 years ago
- Lucene for Information Retrieval☆50Updated 2 years ago
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary.☆195Updated 2 years ago
- Open-Source Information Retrieval Reproducibility Challenge☆50Updated 9 years ago
- Named Entity Recognition data for Europeana Newspapers☆173Updated 2 years ago
- A language detection Web Service☆53Updated 8 years ago
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW…☆73Updated last year