Russian text segmenter and tokenizer
☆18Mar 2, 2021Updated 5 years ago
Alternatives and similar repositories for rutokenizer
Users who are interested in rutokenizer are comparing it to the libraries listed below.
- Part-of-Speech Tagger for Russian language☆23Jul 29, 2020Updated 5 years ago
- Named entity recognition (NER) in Russian texts☆42Oct 10, 2025Updated 7 months ago
- Lemmatizer for Russian-language texts☆46Jun 4, 2020Updated 5 years ago
- 📚 A small collection of Russian literature 📚☆15Dec 9, 2022Updated 3 years ago
- Code for fine-tuning LMs (rugpt, LLaMa, FRED T5) using transformers + deepspeed + LoRa☆14May 22, 2023Updated 3 years ago
- A simple text normalizer for use before speech synthesis☆48May 13, 2024Updated 2 years ago
- T5-based (Russian) text normalization☆27Jan 25, 2024Updated 2 years ago
- Simple Python package for breaking Russian words into syllables☆32Feb 20, 2020Updated 6 years ago
- Bunch of notebooks for pre-training custom Saiga-like LLM☆12Feb 9, 2024Updated 2 years ago
- Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.☆59Feb 27, 2021Updated 5 years ago
- Russian GPT2 model☆62Jul 12, 2021Updated 4 years ago
- Data and Code for COLM 2025 paper "Retrieval-Augmented Generation with Conflicting Evidence"☆23Apr 18, 2025Updated last year
- A prototype of an Agentic RAG (Retrieval-Augmented Generation) solution using data from Jira, Confluence, and Git.☆11Dec 4, 2024Updated last year
- Grammar rules and dictionaries for the phonetic transcription of Russian sentences☆33Sep 23, 2021Updated 4 years ago
- This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper.☆16Jul 22, 2021Updated 4 years ago
- Using transformers to generate Russian poetry☆36Aug 21, 2023Updated 2 years ago
- Learning vLLM: deploying the Qwen2-0.5B model with vLLM, with Docker-based deployment.☆20Jun 22, 2024Updated last year
- Implementation of a transformer for optical character recognition of Russian words☆14Nov 25, 2023Updated 2 years ago
- radiomixer☆14Feb 16, 2022Updated 4 years ago
- My own raytracer in one week ⚡☆31Feb 27, 2023Updated 3 years ago
- ChatGPT Jailbreak prompts☆15Mar 22, 2023Updated 3 years ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language☆47Mar 20, 2025Updated last year
- Deep Learning based NLP modeling for Russian language☆246Jul 24, 2023Updated 2 years ago
- DEPRECATED - A webapp for collecting speech samples for voice recognition testing and training☆20May 23, 2019Updated 7 years ago
- A tool that generates python code out of your GraphQL schema.☆16Nov 27, 2023Updated 2 years ago
- Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.☆26Jul 25, 2024Updated last year
- A language model project for morphemic analysis, segmentation, and tokenization of Russian words.☆17Jan 10, 2025Updated last year
- NoORM (Not only ORM) - Python library that makes your database operations convenient and natural☆17Oct 27, 2025Updated 7 months ago
- Train punctuation and capitalization models for different languages☆26Apr 2, 2022Updated 4 years ago
- Normalize Text in Russian☆29Nov 7, 2023Updated 2 years ago
- Port of BaseFlight (with MultiWii 2.3 features) for STM32F4DISCOVERY board + GY-86 (mpu6050 + hmc5883 + ms5611) sensors board☆15Feb 3, 2014Updated 12 years ago
- ☆19May 16, 2015Updated 11 years ago
- Russian-language generative chatbot with a profile and facts☆259Jan 20, 2023Updated 3 years ago
- Tensorflow implementation of "FloWaveNet: A Generative Flow for Raw Audio"☆25Apr 19, 2019Updated 7 years ago
- A set of scripts and configurations for pretraining of Large Language Models (LLM)☆35Mar 2, 2025Updated last year
- Grapheme-to-Phoneme conversion with Joint-Sequence RnnLMs☆31Dec 15, 2014Updated 11 years ago
- ☆24Nov 3, 2024Updated last year
- ☆27Aug 29, 2021Updated 4 years ago
- [DEPRECATED] Morpheus Music AI implementation spin-off :)☆16Oct 5, 2022Updated 3 years ago