Downloads and parses a subtitle dataset from opensubtitles.org
★15 · Apr 19, 2024 · Updated last year
Alternatives and similar repositories for Opensubtitles_dataset
Users that are interested in Opensubtitles_dataset are comparing it to the libraries listed below.
- Download, parse, and filter data from Court Listener, part of the FreeLaw projects. Data-ready for The-Pile. · ★15 · Jun 3, 2023 · Updated 2 years ago
- Resource and Tool for Writing System Identification (Unicode 17.0) -- LREC 2024 · ★21 · Feb 17, 2026 · Updated last month
- OpenAI Codex for Sublime Text · ★11 · Sep 25, 2021 · Updated 4 years ago
- Haskell phonology library. · ★10 · Jan 23, 2012 · Updated 14 years ago
- Useful prompts for interacting with an AI. · ★14 · Jul 14, 2020 · Updated 5 years ago
- A simple, minimalist writing theme for Typora · ★15 · Jan 20, 2026 · Updated 2 months ago
- A TinyStories LM with SAEs and transcoders · ★14 · Apr 3, 2025 · Updated 11 months ago
- ★95 · Jul 16, 2022 · Updated 3 years ago
- Demo code for learning_text_transformer · ★25 · Feb 22, 2015 · Updated 11 years ago
- ★21 · Oct 20, 2022 · Updated 3 years ago
- An extension of thu-spmi/CAT which contains a full-fledged implementation of CTC-CRF for Tensorflow. · ★12 · Jul 5, 2021 · Updated 4 years ago
- SemEval 2020 task 10 datasets · ★17 · Feb 19, 2020 · Updated 6 years ago
- The case study and multilingual performance of ICASSP submission · ★24 · Sep 24, 2022 · Updated 3 years ago
- ICU based universal language tokenizer · ★34 · Jan 19, 2022 · Updated 4 years ago
- StyleGAN2 - Official TensorFlow Implementation · ★25 · Sep 5, 2020 · Updated 5 years ago
- MaltParser for Russian · ★12 · Mar 10, 2019 · Updated 7 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) · ★36 · Jun 29, 2025 · Updated 9 months ago
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models · ★86 · Dec 6, 2023 · Updated 2 years ago
- simple kv store for streams · ★36 · Mar 14, 2013 · Updated 13 years ago
- ★32 · May 23, 2023 · Updated 2 years ago
- MIDict (Multi-Index Dict) can be indexed by any "keys" or "values", suitable as a bidirectional/inverse dict or a multi-key/multi-value d… · ★14 · May 19, 2016 · Updated 9 years ago
- Precise type-checker for JavaScript · ★11 · Oct 23, 2025 · Updated 5 months ago
- Myanmar and Thai Language Resources · ★10 · Jul 18, 2022 · Updated 3 years ago
- A utility to read and write PDFs with Python · ★11 · Apr 28, 2022 · Updated 3 years ago
- statically generated weekly digest of articles read in Pocket · ★10 · May 14, 2019 · Updated 6 years ago
- A nuxt module to expose Vuex state in the browser URL for easy sharing · ★12 · Aug 28, 2017 · Updated 8 years ago
- Learned string similarity for entity names using optimal transport. · ★35 · Nov 17, 2020 · Updated 5 years ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. · ★29 · Mar 14, 2025 · Updated last year
- Creating super-parallel corpora of more than 1500+ unique languages for NLP research · ★34 · Dec 8, 2022 · Updated 3 years ago
- These are lists for a variety of languages containing words that are distinctive to each language. · ★41 · Apr 5, 2022 · Updated 3 years ago
- Python wrapper for Google's syntaxnet · ★15 · Apr 8, 2019 · Updated 6 years ago
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training · ★46 · Sep 22, 2020 · Updated 5 years ago
- Dockerized version of Google's SyntaxNet Parser and POS tagger for Russian + standalone server. · ★16 · May 4, 2017 · Updated 8 years ago
- An abstract, safe, and concise color conversion library for Rust nightly. This requires the feature adt_const_params. · ★12 · Nov 18, 2022 · Updated 3 years ago
- A menu and CLI based console program to play and write songs for the PC Speaker · ★15 · Aug 1, 2019 · Updated 6 years ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. · ★14 · Jun 27, 2023 · Updated 2 years ago
- MMSE STSA Speech enhancement · ★15 · Aug 24, 2015 · Updated 10 years ago
- Scheduled, asynchronous JSON fetching for Node.js applications · ★12 · Mar 19, 2026 · Updated last week
- generate rules from lists of words · ★16 · Jul 9, 2021 · Updated 4 years ago