A simple Python library for tokenizing text into sentences and sentences into words. Small, fast, and robust. Comes with a Ukrainian flavour.
☆61 · Nov 1, 2023 · Updated 2 years ago
Alternatives and similar repositories for tokenize-uk
Users who are interested in tokenize-uk compare it with the libraries listed below.
- Ukrainian stopwords collection ☆11 · Mar 5, 2020 · Updated 6 years ago
- Brown Corpus of the Ukrainian Language ☆118 · Apr 21, 2026 · Updated 2 weeks ago
- Site and documents of the lang-uk group ☆39 · Jul 22, 2016 · Updated 9 years ago
- ☆19 · Feb 7, 2017 · Updated 9 years ago
- Stemmer for the Ukrainian language in Python ☆24 · Aug 10, 2017 · Updated 8 years ago
- Home of Projector's "Data Science. Natural Language Processing" 2020 Edition ☆19 · Oct 3, 2023 · Updated 2 years ago
- UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language ☆270 · Feb 11, 2024 · Updated 2 years ago
- Dictionary of Ukrainian words (words, word forms, syntactic data, literary sources) ☆31 · May 2, 2017 · Updated 9 years ago
- A corpus of Ukrainian Twitter texts, with instructions for downloading and filtering them. ☆16 · Jul 4, 2019 · Updated 6 years ago
- Add accents to words in the Ukrainian language ☆15 · Oct 31, 2022 · Updated 3 years ago
- Ukrainian lemmatizer plugin for ElasticSearch ☆47 · Mar 22, 2020 · Updated 6 years ago
- Manually annotated morphological, syntactic, and coreference corpus of the Ukrainian language ☆28 · Aug 2, 2022 · Updated 3 years ago
- Flask/Mongo application providing an intuitive web interface for task distribution ☆36 · Feb 19, 2026 · Updated 2 months ago
- Code for detecting the language of a text in Python using fastText ☆13 · May 25, 2020 · Updated 5 years ago
- A tool to extract plain text from HTML pages ☆10 · Dec 7, 2017 · Updated 8 years ago
- 🇺🇦 Speech Recognition & Synthesis for Ukrainian ☆432 · Sep 12, 2025 · Updated 7 months ago
- Match tokenized words and phrases within the original, untokenized, often messy, text. ☆19 · Apr 11, 2023 · Updated 3 years ago
- Deep Reinforcement Learning Agent ☆19 · Dec 9, 2015 · Updated 10 years ago
- ☆13 · Feb 11, 2021 · Updated 5 years ago
- Vosk ASR Docker images with GPU for Jetson boards, PCs, M1 laptops and GPC ☆45 · May 16, 2022 · Updated 3 years ago
- A Telegram bot for correcting language mistakes in group chats ☆10 · Jun 29, 2021 · Updated 4 years ago
- ☆30 · Mar 25, 2026 · Updated last month
- ☆17 · Sep 2, 2017 · Updated 8 years ago
- A corpus containing 4.5K conversations from the Conversational Question-Answering dataset CoQA, for a total of 53K follow-up question-ans… ☆16 · Jun 12, 2023 · Updated 2 years ago
- ☆27 · Jun 12, 2023 · Updated 2 years ago
- LEFTJOIN.ru public repository ☆24 · Dec 8, 2022 · Updated 3 years ago
- All presentations from Data Fest Kyiv 2017 http://datafest.in.ua ☆13 · Apr 24, 2017 · Updated 9 years ago
- Scripts for "Deploy ML to production" workshop ☆23 · Apr 25, 2018 · Updated 8 years ago
- Basic scaffold for a Django Rest Framework + React app. ☆13 · Feb 17, 2023 · Updated 3 years ago
- Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications. ☆19 · May 24, 2021 · Updated 4 years ago
- Text language identification using Wikipedia data ☆31 · Aug 15, 2017 · Updated 8 years ago
- Our project to digitize and open all declarations of Ukrainian officials ☆25 · Mar 6, 2023 · Updated 3 years ago
- C++ and Python code from my projects ☆10 · Feb 29, 2020 · Updated 6 years ago
- Fine-tune GPT-2 for text summarization ☆17 · Aug 16, 2021 · Updated 4 years ago
- DEREK (Domain Entities and Relations Extraction Kit) ☆10 · May 22, 2023 · Updated 2 years ago
- Using NLP Topic Models to automate research paper topic classification. ☆13 · Apr 14, 2021 · Updated 5 years ago
- Lisp web crawler and scraper ☆27 · Oct 28, 2017 · Updated 8 years ago
- ☆23 · Dec 11, 2024 · Updated last year
- Duolingo Notes helps Duolingo users save notes while learning. ☆13 · Jun 18, 2014 · Updated 11 years ago