zamgi / lingvo--TextSegmenter
Text segmentation into separate words using a simple unigram model and the Viterbi algorithm
☆9Updated last month
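The description above names two ingredients: a unigram word model and the Viterbi algorithm. As a rough illustration of how they fit together, here is a minimal Python sketch. It is not the repository's code: the language, the function name `segment`, the `TOY_COUNTS` table, and the unknown-word penalty are all assumptions made for this example.

```python
import math

# Hypothetical toy unigram counts; a real model would be estimated from a corpus.
TOY_COUNTS = {"the": 50, "cat": 10, "sat": 10, "on": 20, "mat": 5, "them": 3, "at": 15}


def segment(text, unigram_counts, max_word_len=20):
    """Split unspaced text into the most probable word sequence.

    Each candidate word is scored independently (unigram assumption), and
    dynamic programming (Viterbi) finds the best-scoring split.
    """
    total = sum(unigram_counts.values())

    def log_prob(word):
        count = unigram_counts.get(word)
        if count is None:
            # Assumed smoothing: unknown words get a penalty that grows with length.
            return -math.log(total) - len(word) * math.log(10)
        return math.log(count / total)

    n = len(text)
    best = [0.0] + [-math.inf] * n  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)            # back[i]: start index of the last word in text[:i]

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            score = best[start] + log_prob(text[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]
```

With the toy counts, `segment("thecatsatonthemat", TOY_COUNTS)` prefers "the" + "mat" over "them" + "at" because the product of the unigram probabilities is higher. Working in log space keeps the scores from underflowing on long inputs.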
Related projects
Alternatives and complementary repositories for lingvo--TextSegmenter
- Named entity recognition (NER) in Russian texts☆40Updated last month
- ☆48Updated 6 years ago
- ANYKS Spell-Checker☆18Updated last year
- Evaluation tools for the RUSSE evaluation campaign.☆37Updated 7 years ago
- ☆34Updated 7 years ago
- ☆55Updated 2 years ago
- SpaCy official Russian model proposal☆31Updated 3 years ago
- ☆28Updated 5 years ago
- Simple and fast rule-based sentence segmentation. Tested on the OpenCorpora and SynTagRus datasets.☆53Updated 6 years ago
- 🔬 Cleaning junk out of datasets (normalization, preprocessing)☆40Updated 3 years ago
- ☆11Updated 10 months ago
- Automatic classification of Russian-language text☆11Updated last month
- Natural language processing tools for English and Russian (POS tagging, syntax parsing, SRL, NER, language detection etc.)☆62Updated 3 weeks ago
- Morphological analyzer `mystem` (Russian language) wrapper for JVM languages☆24Updated 2 months ago
- Large silver-standard Russian corpus with NER, morphology and syntax markup☆61Updated last year
- ☆21Updated 3 years ago
- Term extraction for Russian language☆88Updated 5 years ago
- Java port of pymorphy2☆45Updated 2 years ago
- Morphological Analyzer for Russian 💬☆40Updated 3 years ago
- Convert GitHub to Habr or Dev Markdown with additional features☆22Updated 2 years ago
- Custom Russian tokenizer for spaCy☆42Updated 5 years ago
- Seman is a set of linguistic tools for analyzing Russian or German texts; it contains lexicons and grammars. The project is interesting as a…☆84Updated 3 months ago
- Russian data from the SynTagRus corpus.☆80Updated this week
- Inspired by word2vec-pride-vis: replaces words in the text of the most valuable Russian novels with their closest word2vec model words. By Boris …☆46Updated 3 months ago
- Python text speller☆38Updated last month
- A project for converting numbers written out as text in Russian.☆100Updated 3 years ago
- Probing suite for evaluation of Russian embedding and language models☆32Updated last month
- Python wrapper for PullEnti☆21Updated 4 years ago
- ☆30Updated 5 years ago
- Morphological analyzer for Russian, written in C# for .NET☆48Updated 3 years ago