dbklim / Russian_subtitles_dataset
Preprocessing of a dataset of 347 TV series subtitles (thanks to the Taiga Corpus) for building a word2vec model or a JamSpell model, training neural networks or chatbots, or any other NLP task.
☆25Jun 14, 2019Updated 6 years ago
Alternatives and similar repositories for Russian_subtitles_dataset
Users who are interested in Russian_subtitles_dataset are comparing it to the libraries listed below
- A fasttrack implementation in python☆13Feb 7, 2026Updated last week
- Simple Python package for breaking Russian words into syllables☆32Feb 20, 2020Updated 5 years ago
- ☆16May 19, 2016Updated 9 years ago
- Modified version of RusStress (https://github.com/MashaPo/russtress) — python package for placing stress in Russian text using RNN (BiLST…☆42Aug 7, 2024Updated last year
- Data from the 6th edition of A. A. Zaliznyak's "Grammatical Dictionary of the Russian Language" (2010) as text files☆24Sep 17, 2024Updated last year
- Morphological analyzer for Russian and English languages based on neural networks and dictionary-lookup systems.☆157May 22, 2024Updated last year
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format☆33Jul 5, 2019Updated 6 years ago
- UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.☆29May 14, 2025Updated 9 months ago
- InSales e-commerce platform API bindings☆14Jul 13, 2024Updated last year
- Chatbox for Brainster☆10Aug 1, 2020Updated 5 years ago
- Based on Recursive Backtracker algo☆11Oct 25, 2020Updated 5 years ago
- ☆10Jan 6, 2025Updated last year
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…☆36Apr 25, 2025Updated 9 months ago
- This repository is about how to build an SQLite version of the Arabic WordNet database.☆10Mar 19, 2019Updated 6 years ago
- A local, voice-controlled AI assistant with the personality of HAL 9000 from 2001: A Space Odyssey.☆20Aug 16, 2025Updated 5 months ago
- MG top-down beam parsing☆13Jul 2, 2018Updated 7 years ago
- Implementation of a fast semantic chunker in C++, installable in python 3.7+ projects.☆22Sep 20, 2025Updated 4 months ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github…☆14Dec 19, 2022Updated 3 years ago
- Russian phonetical transcription☆11Nov 19, 2025Updated 2 months ago
- Nanos klib for NVIDIA GPUs☆14Mar 25, 2025Updated 10 months ago
- This is now the official location of the Kaldi project.☆10Aug 22, 2019Updated 6 years ago
- ☆11Dec 14, 2020Updated 5 years ago
- Vector Symbolic Architecture library☆11Mar 27, 2023Updated 2 years ago
- ☆29Dec 20, 2025Updated last month
- Named Entity (NER) annotations of the Hebrew Treebank (Haaretz newspaper) corpus, including: morpheme and token level NER labels, nested …☆10Dec 27, 2021Updated 4 years ago
- A plugin for IDA Pro and Cheat Engine to get the offset of the current module☆11May 30, 2024Updated last year
- Portable library for binary (bi-valued) image processing☆14Jun 12, 2024Updated last year
- Persian Grapheme-to-Phoneme (G2P) converter☆41Jul 25, 2024Updated last year
- Source code for "Unsupervised Lexicon Discovery from Acoustic Input ", Lee et al, 2015 TACL☆10Aug 11, 2016Updated 9 years ago
- A Python executable remake of the Forgotten Empires AoE2 AI Builder that creates AoE2DE-compatible AI files.☆12Jan 9, 2023Updated 3 years ago
- ☆11Updated this week
- ACL Rolling Review website☆11Feb 2, 2026Updated last week
- Tutorial on {Deep} Phonetic Tools given in BigPhon @ LabPhon15☆12Apr 17, 2017Updated 8 years ago
- 🎵 muse: Music Separation☆11Feb 14, 2024Updated 2 years ago
- Natural Language Inflection in English☆11Jan 10, 2022Updated 4 years ago
- ☆10Dec 11, 2016Updated 9 years ago
- Fork of RecurrentGPT with modifications☆10Sep 18, 2024Updated last year
- A wrapper, a lemmatizer and REST API implemented in Python for emMorph (Humor) Hungarian morphological analyzer☆11Feb 11, 2021Updated 5 years ago
- A massively multilingual corpus and pretrained model for IGT☆12Updated this week