indix / whatthelang
Lightning Fast Language Prediction
★165 · Updated 5 years ago
Alternatives and similar repositories for whatthelang:
Users that are interested in whatthelang are comparing it to the libraries listed below
- Hunspell extension for spaCy 2.0. ★94 · Updated 7 months ago
- Language detection extension for spaCy 2.0+ ★112 · Updated 6 years ago
- Emoji handling and meta data for spaCy with custom extension attributes ★181 · Updated last year
- spaCy + UDPipe ★161 · Updated 2 years ago
- Language independent truecaser in Python. ★160 · Updated 3 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ★169 · Updated 3 years ago
- A fully customisable language detection pipeline for spaCy ★92 · Updated 5 years ago
- A spell-checker extending Peter Norvig's with multi-typo correction, hamming distance weighting, and more. ★98 · Updated 4 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ★90 · Updated 6 years ago
- Cython wrapper on Hunspell Dictionary ★66 · Updated 8 months ago
- Textpipe: clean and extract metadata from text ★302 · Updated 3 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ★150 · Updated last year
- A small tool that EXPLains spACY parse results. See what I did there? ★83 · Updated 3 years ago
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance. ★75 · Updated 3 years ago
- A compound word splitter for Python ★48 · Updated 3 years ago
- A minimal, pure Python library to interface with CoNLL-U format files. ★148 · Updated last year
- Server/Client around Spacy to load spacy only once ★46 · Updated 7 years ago
- A python true casing utility that restores case information for texts ★88 · Updated 2 years ago
- Scripts, tools and resources for developing spaCy ★125 · Updated 5 years ago
- Parse natural language time expressions in python ★131 · Updated 2 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ★113 · Updated 2 years ago
- Misspelling Oblivious Word Embeddings ★203 · Updated 5 years ago
- Named Entity Recognition data for Europeana Newspapers ★171 · Updated last year
- Text tokenization and sentence segmentation (segtok v2) ★201 · Updated 2 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ★88 · Updated 4 years ago
- General-Purpose Neural Networks for Sentence Boundary Detection ★72 · Updated last year
- spaCy pipeline component for adding text readability meta data to Doc objects. ★56 · Updated 5 years ago
- xfspell: the Transformer Spell Checker ★188 · Updated 4 years ago
- Natural language processing (NLP) utils: word embeddings (Word2Vec, GloVe, FastText, ...) and preprocessing transformers, compatible wi… ★62 · Updated last year
- Fast Word Clustering Software ★78 · Updated 3 weeks ago