adbar / py3langid
Faster, modernized fork of the language identification tool langid.py
☆56 Updated 7 months ago
Alternatives and similar repositories for py3langid
Users who are interested in py3langid compare it to the libraries listed below.
- Fast and robust date extraction from web pages, with Python or on the command-line ☆133 Updated 6 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆166 Updated last month
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆152 Updated 2 years ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆65 Updated 3 weeks ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆107 Updated last month
- Boolean text search in Python ☆45 Updated 3 weeks ago
- A Python 3 phonetics library. ☆133 Updated 5 years ago
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 Updated 7 months ago
- ☆170 Updated 3 months ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 Updated 3 months ago
- Lightning Fast Language Prediction 🚀 ☆167 Updated 6 years ago
- Targeted language identifier, based on FastText and Hunspell. ☆36 Updated 5 months ago
- Seed Machine Translation Data ☆32 Updated 8 months ago
- Efficient teacher-student models and scripts to make them ☆51 Updated last year
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified (zh-hans) and Traditional Chinese (zh-ha… ☆39 Updated 2 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Extracts plain text, language identification and more metadata from WARC records ☆23 Updated 4 months ago
- Abydos NLP/IR library for Python ☆186 Updated 2 years ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆73 Updated 2 weeks ago
- 80x faster and 95% accurate language identification with Fasttext ☆158 Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆248 Updated 2 years ago
- Multilingual syllable annotation pipeline component for spaCy ☆39 Updated 2 years ago
- Multi-Language Identification ☆28 Updated 11 months ago
- Legal document classification with EuroVoc descriptors on 22 languages. ☆26 Updated 2 years ago
- Python 3 library for processing historical English ☆67 Updated 11 months ago
- 📂 Additional lookup tables and data resources for spaCy ☆105 Updated last month
- Measure the readability of a given text using surface characteristics ☆79 Updated 5 months ago
- A modern, interlingual wordnet interface for Python ☆254 Updated last week
- Extract dates from text ☆64 Updated 4 years ago