A simple and fast rule-based sentence segmenter. Tested on the OpenCorpora and SynTagRus datasets.
☆52Jul 4, 2018Updated 7 years ago
Alternatives and similar repositories for ru_sentence_tokenizer
Users who are interested in ru_sentence_tokenizer are comparing it to the libraries listed below
- [experiment] CRF-based disambiguation engine for pymorphy2☆10May 9, 2016Updated 9 years ago
- Rule-based token, sentence segmentation for Russian language☆278Jul 24, 2023Updated 2 years ago
- ☆30Dec 25, 2022Updated 3 years ago
- ☆56May 12, 2018Updated 7 years ago
- Python interface to http://opencorpora.org/☆45Oct 11, 2020Updated 5 years ago
- Crawlers for the Taiga Corpus and Taiga Parser project, downloading resources from open sources☆14Apr 9, 2019Updated 6 years ago
- RuREBus shared task repo☆29Jan 18, 2021Updated 5 years ago
- http://www.dialog-21.ru/evaluation/2016/letter/☆57Dec 8, 2016Updated 9 years ago
- ANYKS Spell-Checker☆19Jan 3, 2023Updated 3 years ago
- "Rossiya Segodnya" news dataset☆46Sep 25, 2019Updated 6 years ago
- DEREK (Domain Entities and Relations Extraction Kit)☆10May 22, 2023Updated 2 years ago
- The code for the paper 'DetIE: Multilingual Open Information Extraction Inspired by Object Detection' by Vasilkovsky et al.☆20Sep 1, 2022Updated 3 years ago
- A list of pretrained Transformer models for the Russian language.☆177Feb 3, 2020Updated 6 years ago
- ☆51Nov 20, 2017Updated 8 years ago
- Python wrapper for PullEnti☆21Jul 31, 2020Updated 5 years ago
- ☆33Sep 20, 2017Updated 8 years ago
- Gazeta: Dataset for automatic summarization of Russian news☆36Oct 6, 2021Updated 4 years ago
- ☆36Dec 8, 2022Updated 3 years ago
- Compact high quality word embeddings for Russian language☆214Jul 24, 2023Updated 2 years ago
- NLP workshop at DataFest Siberia 2019☆22Dec 8, 2022Updated 3 years ago
- Show summary of a large number of URLs in a Jupyter Notebook☆17Feb 10, 2026Updated 3 weeks ago
- Topic modeling with BigARTM: an interactive book☆60Dec 5, 2018Updated 7 years ago
- System for automatic pronominal resolution for Russian☆14Apr 3, 2020Updated 5 years ago
- Deep Learning based NLP modeling for Russian language☆241Jul 24, 2023Updated 2 years ago
- Russian data from the SynTagRus corpus.☆86Nov 12, 2025Updated 3 months ago
- Samsung Natural Language Processing Pipeline (basically for Russian language): morphology, dependency parser and much more☆59Oct 3, 2020Updated 5 years ago
- ☆18Jun 18, 2021Updated 4 years ago
- Dataset collected from popular Russian collective blog Habrahabr.ru☆13Oct 24, 2016Updated 9 years ago
- Mini-library for producing graph visualizations from embedding models☆28Sep 10, 2020Updated 5 years ago
- Links to Russian corpora + Python functions for loading and parsing☆309Feb 9, 2026Updated 3 weeks ago
- ☆33Feb 14, 2019Updated 7 years ago
- Odnoklassniki API Python wrapper☆12Jul 3, 2022Updated 3 years ago
- SpaCy official Russian model proposal☆32Jan 24, 2021Updated 5 years ago
- Accentor and transcriptor for Russian language☆133Jun 19, 2022Updated 3 years ago
- Large silver-standard Russian corpus with NER, morphology and syntax markup☆73Jul 24, 2023Updated 2 years ago
- LUNA: a Framework for Language Understanding and Naturalness Assessment.☆12Sep 9, 2023Updated 2 years ago
- ☆13Aug 12, 2019Updated 6 years ago
- Models for automatic abstractive summarization☆174Jul 3, 2022Updated 3 years ago
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2…☆14Feb 2, 2026Updated last month