Datasets and tools for basic natural language processing.
☆389Sep 10, 2021Updated 4 years ago
Alternatives and similar repositories for language-resources
Users that are interested in language-resources are comparing it to the libraries listed below.
- Text-to-Speech tutorial at SLTU 2016☆35May 10, 2016Updated 9 years ago
- ☆213Jun 16, 2018Updated 7 years ago
- A simple tutorial on setting up Sparrowhawk - a text-to-speech normalization engine☆14Oct 16, 2017Updated 8 years ago
- Covering grammars for English and Russian text normalization☆61Sep 15, 2019Updated 6 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 3 years ago
- 💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies☆1,392Jun 6, 2024Updated last year
- Read-only unofficial mirror of Pynini☆17May 7, 2019Updated 6 years ago
- Massively multilingual pronunciation mining☆362Mar 3, 2026Updated last month
- ☆17Jul 29, 2018Updated 7 years ago
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…☆36Apr 25, 2025Updated 11 months ago
- 🎯 Speech Recognition Challenge by Speech Lab - IIT Madras☆10Nov 5, 2020Updated 5 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆23Mar 21, 2021Updated 5 years ago
- Code for ICASSP 2019 paper☆18Oct 29, 2018Updated 7 years ago
- Thai smart home corpus with "Gowajee" hotword☆18Jul 30, 2023Updated 2 years ago
- Read-only unofficial mirror of the OpenGrm Thrax Grammar Development Tools☆16May 2, 2019Updated 6 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)☆21Sep 27, 2017Updated 8 years ago
- Crawler for linguistic corpora☆213Aug 18, 2025Updated 7 months ago
- Automatically exported from code.google.com/p/transducersaurus☆11Apr 1, 2015Updated 11 years ago
- CMU Wilderness Multilingual Speech Dataset☆292Apr 20, 2019Updated 6 years ago
- Data and code for grapheme-to-phoneme transducers in lots of languages☆151Apr 5, 2024Updated 2 years ago
- Awesome Lao Natural Language Processing☆16Mar 7, 2025Updated last year
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Feb 15, 2024Updated 2 years ago
- G2P with Tensorflow☆681Jul 29, 2024Updated last year
- ☆14Jun 12, 2015Updated 10 years ago
- CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed)☆394Sep 14, 2021Updated 4 years ago
- Corpus of oral arguments (recorded speech + official transcripts) of the United States Supreme Court☆22Dec 8, 2022Updated 3 years ago
- CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages☆483Mar 6, 2020Updated 6 years ago
- Simple text to phones converter for multiple languages☆1,530Sep 26, 2024Updated last year
- Open tools and data for cloudless automatic speech recognition☆446Mar 30, 2021Updated 5 years ago
- A GPU language model, based on btree backed tries.☆29Mar 6, 2018Updated 8 years ago
- A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)☆804Mar 25, 2026Updated 2 weeks ago
- This is now the official location of the Merlin project.☆1,321Mar 3, 2020Updated 6 years ago
- ☆20Jul 22, 2022Updated 3 years ago
- Read-only unofficial mirror of OpenFst☆44May 15, 2022Updated 3 years ago
- An opensource text-to-speech (TTS) voice building tool☆684Jul 22, 2024Updated last year
- ☆45Oct 24, 2020Updated 5 years ago
- VCTK multi-speaker tacotron for ICASSP 2020☆266Mar 29, 2022Updated 4 years ago
- The Dakshina dataset is a collection of text in both Latin and native scripts for 12 South Asian languages. For each language, the datase…☆206May 27, 2020Updated 5 years ago
- Lao language Natural Language Processing toolkit☆34Jan 9, 2026Updated 3 months ago