pemistahl / lingua-py
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
☆1,241 · Updated last week
Alternatives and similar repositories for lingua-py:
Users who are interested in lingua-py are comparing it to the libraries listed below:
- 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detector that works out-of-the-box. ☆831 · Updated 6 months ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆298 · Updated last month
- 🧹 Python package for text cleaning ☆969 · Updated last year
- Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way. ☆858 · Updated 3 weeks ago
- Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors). ☆1,214 · Updated this week
- Open neural machine translation models and web services ☆655 · Updated 2 months ago
- ASCII transliterations of Unicode text - GitHub mirror ☆543 · Updated 9 months ago
- A Python-based HTML to text conversion library, command-line client, and web service. ☆290 · Updated last month
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ☆818 · Updated 3 weeks ago
- ☆807 · Updated last year
- Port of Google's language-detection library to Python. ☆1,752 · Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆238 · Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆151 · Updated 3 months ago
- 80x faster and 95% accurate language identification with Fasttext ☆146 · Updated last year
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆150 · Updated last year
- Article extraction benchmark: dataset and evaluation scripts ☆301 · Updated 9 months ago
- ☆168 · Updated 8 months ago
- Python bindings to PDFium ☆522 · Updated this week
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆413 · Updated 3 weeks ago
- 📚 Process PDFs, Word documents and more with spaCy ☆412 · Updated last month
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆257 · Updated 3 months ago
- Fuzzy string matching, grouping, and evaluation. ☆751 · Updated this week
- NeuSpell: A Neural Spelling Correction Toolkit ☆684 · Updated last year
- A Collection of BM25 Algorithms in Python ☆1,107 · Updated 4 months ago
- Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy ☆1,019 · Updated last month
- ☆349 · Updated last year
- Training open neural machine translation models ☆351 · Updated 6 months ago
- Benchmarking PDF libraries ☆257 · Updated last year
- Rapid fuzzy string matching in Python using various string metrics ☆2,907 · Updated 3 weeks ago
- A Python library to access ISO country, subdivision, language, currency and script definitions and their translations. ☆812 · Updated this week