Russian text segmenter and tokenizer
☆18Mar 2, 2021Updated 5 years ago
Alternatives and similar repositories for rutokenizer
Users who are interested in rutokenizer are comparing it to the libraries listed below.
- Part-of-Speech Tagger for Russian language☆23Jul 29, 2020Updated 5 years ago
- Named entity recognition (NER) in Russian texts☆42Oct 10, 2025Updated 5 months ago
- Lemmatizer for Russian-language texts☆46Jun 4, 2020Updated 5 years ago
- 📚 A small collection of Russian literature 📚☆13Dec 9, 2022Updated 3 years ago
- Simple text normalizer for use before speech synthesis☆46May 13, 2024Updated last year
- Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)☆78Aug 10, 2020Updated 5 years ago
- Code for fine-tuning LMs (rugpt, LLaMa, FRED T5) using transformers + deepspeed + LoRa☆14May 22, 2023Updated 2 years ago
- T5-based (Russian) text normalization☆26Jan 25, 2024Updated 2 years ago
- Simple Python package for breaking Russian words into syllables☆32Feb 20, 2020Updated 6 years ago
- Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.☆59Feb 27, 2021Updated 5 years ago
- ☆13Aug 7, 2021Updated 4 years ago
- Russian GPT2 model☆61Jul 12, 2021Updated 4 years ago
- [UNSUPPORTED] - please use https://github.com/kmike/pymorphy2. Russian and English morphology analyser (POS tagger + inflection engine) w…☆41Jul 23, 2015Updated 10 years ago
- SpaCy official Russian model proposal☆32Jan 24, 2021Updated 5 years ago
- Multilingual RAG benchmark.☆10Nov 22, 2024Updated last year
- Grammar rules and dictionaries for the phonetic transcription of Russian sentences☆33Sep 23, 2021Updated 4 years ago
- This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper.☆16Jul 22, 2021Updated 4 years ago
- Corpus of Russian news articles collected from Lenta.Ru☆145Nov 19, 2022Updated 3 years ago
- My NLP datasets for Russian language☆386Feb 18, 2023Updated 3 years ago
- Learning vLLM: deploying the Qwen2-0.5B model with vLLM, and deploying with Docker.☆20Jun 22, 2024Updated last year
- Implementation of transformer for optical character recognition of Russian words☆14Nov 25, 2023Updated 2 years ago
- radiomixer☆14Feb 16, 2022Updated 4 years ago
- ChatGPT Jailbreak prompts☆16Mar 22, 2023Updated 3 years ago
- Deep Learning based NLP modeling for Russian language☆243Jul 24, 2023Updated 2 years ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆47Mar 20, 2025Updated last year
- A tool that generates python code out of your GraphQL schema.☆16Nov 27, 2023Updated 2 years ago
- Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.☆26Jul 25, 2024Updated last year
- ☆16Oct 29, 2023Updated 2 years ago
- A language model project for morphemic analysis, segmentation, and tokenization of Russian words.☆17Jan 10, 2025Updated last year
- 🇷🇺 Punctuation restoration production-ready model for Russian language 🇷🇺☆59Jul 9, 2021Updated 4 years ago
- Solves basic Russian NLP tasks, API for lower level Natasha projects☆1,316Oct 17, 2024Updated last year
- Train punctuation and capitalization models for different languages☆26Apr 2, 2022Updated 3 years ago
- Normalize Text in Russian☆28Nov 7, 2023Updated 2 years ago
- MCP server exposing AutoHotkey functionality, enabling model interfaces to automation tasks on Windows.☆16May 21, 2025Updated 10 months ago
- ulauncher Extension to convert timestamp date to human readable date.☆11Jan 25, 2023Updated 3 years ago
- ☆19May 16, 2015Updated 10 years ago
- A ulauncher extension for DuckDuckGo☆15Aug 7, 2021Updated 4 years ago
- Tensorflow implementation of "FloWaveNet: A Generative Flow for Raw Audio"☆25Apr 19, 2019Updated 6 years ago
- A set of scripts and configurations for pretraining of Large Language Models (LLM)☆37Mar 2, 2025Updated last year