common-voice / cv-sentence-extractor
Scraping Wikipedia for fair use sentences
★54, updated last year
Alternatives and similar repositories for cv-sentence-extractor
Users that are interested in cv-sentence-extractor are comparing it to the libraries listed below
- Tool to collect and review sentences for Common Voice (★81, updated 2 years ago)
- Software for creating speech recognition models. (★159, updated last year)
- 🐸TTS recipes for different datasets (★86, updated 3 years ago)
- Command line tool to create corpora for Common Voice (★78, updated last year)
- Metadata and versioning details for the Common Voice dataset (★156, updated this week)
- Universal Romanizer that can convert any Unicode script to Roman (Latin) script (★225, updated last year)
- Datasets and tools for basic natural language processing. (★386, updated 4 years ago)
- Massively multilingual pronunciation mining (★352, updated last month)
- Indian Language Tagger and Chunker (Hindi, Telugu, Tamil, Marathi, Punjabi, Kannada, Malayalam, Urdu, Bengali) (★42, updated 2 years ago)
- Open Source AI Benchmarking toolkit for benchmarking speech to text services (★58, updated last year)
- A versioned Python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). (★65, updated this week)
- The Kinyarwanda model for DeepSpeech (★16, updated 4 years ago)
- No description provided (★45, updated 7 years ago)
- A tool for automatic phoneme transcription (★160, updated 2 years ago)
- Linguistic processing for Common Voice (★57, updated last year)
- A guide to building language technology in new languages. (★59, updated 3 years ago)
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning (★40, updated 3 years ago)
- Crawler for linguistic corpora (★208, updated last month)
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. (★40, updated 3 years ago)
- Program to benchmark various speech recognition APIs (★80, updated 6 years ago)
- Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion. (★130, updated 4 years ago)
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. (★111, updated last year)
- A crash course for training speech recognition models using DeepSpeech. (★25, updated 4 years ago)
- Open information and community for machine translation (★80, updated this week)
- Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments (★103, updated 5 years ago)
- 🐸STT integration examples (★129, updated 3 years ago)
- Bitextor generates translation memories from multilingual websites (★296, updated 11 months ago)
- Unicode Standard tokenization routines and orthography profile segmentation (★37, updated 7 months ago)
- Convert Arpabet to IPA. Arpabet is the set of phonemes used by the CMU Pronouncing Dictionary. IPA is the International Phonetic Alphabet… (★44, updated 5 years ago)
- No description provided (★16, updated 4 years ago)