bsolomon1124 / pycld3
Python3 bindings for the Compact Language Detector v3 (CLD3)
☆151 · Updated last year
Alternatives and similar repositories for pycld3:
Users who are interested in pycld3 are comparing it to the libraries listed below.
- ☆168 · Updated this week
- A fully customisable language detection pipeline for spaCy ☆92 · Updated 5 years ago
- Additional lookup tables and data resources for spaCy ☆105 · Updated 2 months ago
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ☆309 · Updated last year
- Hunspell extension for spaCy 2.0. ☆94 · Updated 8 months ago
- Text tokenization and sentence segmentation (segtok v2) ☆202 · Updated 3 years ago
- Fuzzy matching and more functionality for spaCy. ☆256 · Updated 8 months ago
- Sentence transformers models for SpaCy ☆107 · Updated 2 years ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆193 · Updated 2 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆98 · Updated 11 months ago
- LASER multilingual sentence embeddings as a pip package ☆224 · Updated last year
- spaCy + UDPipe ☆161 · Updated 2 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆169 · Updated 3 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆153 · Updated 10 months ago
- Language independent truecaser in Python. ☆160 · Updated 3 years ago
- A python module for English lemmatization and inflection. ☆266 · Updated last year
- A python true casing utility that restores case information for texts ☆88 · Updated 2 years ago
- Implementation of the ClausIE information extraction system for python+spacy ☆222 · Updated 2 years ago
- Google USE (Universal Sentence Encoder) for spaCy ☆183 · Updated 2 years ago
- A simple client for doccano API. ☆84 · Updated 10 months ago
- Robust and Fast tokenizations alignment library for Rust and Python https://tamuhey.github.io/tokenizations/ ☆190 · Updated last year
- Information extraction from English and German texts based on predicate logic ☆135 · Updated last year
- Language detection extension for spaCy 2.0+ ☆112 · Updated 6 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated last year
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ☆314 · Updated last month
- Emoji handling and meta data for spaCy with custom extension attributes ☆181 · Updated last year
- Cython wrapper on Hunspell Dictionary ☆66 · Updated 9 months ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆69 · Updated 2 months ago
- Stand-alone WordNet API ☆48 · Updated 3 years ago
- A spaCy wrapper for DBpedia Spotlight ☆109 · Updated 2 years ago