lang-uk / tokenize-uk
Simple Python library to tokenize texts into sentences and sentences into words. Small, fast, and robust. Comes with a Ukrainian flavour.
☆60 Updated last year
Alternatives and similar repositories for tokenize-uk:
Users who are interested in tokenize-uk are comparing it to the libraries listed below:
- Ukrainian NER annotation project ☆90 Updated 9 months ago
- Brown Corpus of the Ukrainian Language ☆111 Updated last week
- A project demonstrating the LanguageTool NLP API for the Ukrainian language. ☆73 Updated 2 weeks ago
- ☆27 Updated 2 months ago
- Curated list of Ukrainian natural language processing (NLP) resources (corpora, pretrained models, libraries, etc.) ☆173 Updated 3 months ago
- Ukrainian tone dictionary ☆47 Updated 8 years ago
- ☆20 Updated 7 years ago
- UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language ☆259 Updated 11 months ago
- A list of ~2000 Ukrainian stopwords (with numbers) ☆58 Updated 3 years ago
- A manually annotated morphological, syntactic, and coreference corpus of the Ukrainian language ☆26 Updated 2 years ago
- A corpus of Ukrainian Twitter texts + instructions for downloading and filtering texts. ☆15 Updated 5 years ago
- A set of assorted Ukrainian-language data collections gathered while working on anti-corruption projects. CSV format, for some datas… ☆28 Updated 2 years ago
- Stemmer for the Ukrainian language in Python ☆24 Updated 7 years ago
- Ukrainian instruction-tuned language models and datasets ☆87 Updated 6 months ago
- Digital lexicographic systems for the Ukrainian language (grammatical dictionary, synonym dictionary, etymological dictionary, and more) ☆61 Updated 2 years ago
- Dictionary of word stresses in the Ukrainian language 🇺🇦 ☆19 Updated 3 months ago
- Dictionary of obscene words for the Ukrainian language ☆17 Updated 3 years ago
- Adds word stress to Ukrainian texts ☆46 Updated 3 months ago
- Site and documents of the lang-uk group ☆40 Updated 8 years ago
- Home of Projector's "Data Science. Natural Language Processing" 2020 Edition ☆18 Updated last year
- Russian language models for spaCy ☆242 Updated 3 years ago
- 🇺🇦 Speech Recognition & Synthesis for Ukrainian ☆348 Updated 3 weeks ago
- ☆26 Updated last year
- A web-based engine for creating and annotating textual corpora ☆241 Updated last year
- Flask/Mongo application providing an intuitive web interface for task distribution ☆35 Updated 9 months ago
- Large silver-standard Russian corpus with NER, morphology, and syntax markup ☆63 Updated last year
- Scripts for updating pymorphy2 dictionaries ☆37 Updated 8 months ago
- Project to generate a POS tag dictionary for the Ukrainian language ☆565 Updated 2 weeks ago
- http://www.dialog-21.ru/evaluation/2016/letter/ ☆56 Updated 8 years ago
- Training scripts for speech-to-text models for the Ukrainian language ☆35 Updated last year