common-voice / cv-sentence-extractor
Scraping Wikipedia for fair use sentences
☆54 · Updated last year
Alternatives and similar repositories for cv-sentence-extractor
Users who are interested in cv-sentence-extractor are comparing it to the repositories listed below.
- Software for creating speech recognition models. ☆160 · Updated last year
- Command line tool to create corpora for Common Voice ☆78 · Updated last month
- Open Source AI Benchmarking toolkit for benchmarking speech to text services ☆58 · Updated last year
- Massively multilingual pronunciation mining ☆359 · Updated 4 months ago
- A tool for automatic phoneme transcription ☆159 · Updated 2 years ago
- 🐸TTS recipes for different datasets ☆86 · Updated 3 years ago
- Crawler for linguistic corpora ☆213 · Updated 4 months ago
- Datasets and tools for basic natural language processing. ☆387 · Updated 4 years ago
- Metadata and versioning details for the Common Voice dataset ☆164 · Updated 3 weeks ago
- ☆22 · Updated 3 years ago
- ☆48 · Updated 8 years ago
- Convert Arpabet to IPA. Arpabet is the set of phonemes used by the CMU Pronouncing Dictionary. IPA is the International Phonetic Alphabet… ☆44 · Updated 5 years ago
- A crash course for training speech recognition models using DeepSpeech. ☆24 · Updated 4 years ago
- Linguistic processing for Common Voice ☆58 · Updated last year
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning ☆40 · Updated 3 years ago
- A guide to building language technology in new languages. ☆59 · Updated 3 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆66 · Updated 2 weeks ago
- Labeled data for homograph disambiguation ☆62 · Updated 2 years ago
- ipapy is a Python module to work with International Phonetic Alphabet (IPA) strings ☆90 · Updated last year
- Microsoft Speech Language Translation (MSLT) Corpus ☆19 · Updated 8 years ago
- Tool for creation, manipulation and maintenance of voice corpora ☆82 · Updated last year
- British English pronunciation dictionary ☆98 · Updated 8 years ago
- CMUdict maintenance, and tools ☆237 · Updated last year
- Punctuation generation for speech transcripts using lexical and prosodic features ☆41 · Updated 6 years ago
- Pronunciation dictionaries for multiple languages ☆91 · Updated 8 years ago
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support. ☆291 · Updated 9 months ago
- Collaborative data curation for Glottolog ☆182 · Updated this week
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆56 · Updated 3 months ago
- Automatically exported from code.google.com/p/m2m-aligner ☆42 · Updated 9 years ago
- Mozilla Voice Community Playbook ☆48 · Updated last year