coqui-ai/data-checker

coqui-ai / data-checker

🫠 check your data, before you wreck your model

☆16

Alternatives and similar repositories for data-checker

Users who are interested in data-checker are comparing it to the libraries listed below.

coqui-ai / open-bible-scripts
Scripts for working with open.bible data
☆26 · Jan 24, 2022 · Updated 4 years ago
alpoktem / bible2speechDB
Scripts to create speech corpora from open.bible
☆13 · Jan 3, 2022 · Updated 4 years ago
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 · Apr 13, 2022 · Updated 4 years ago
domcross / german-stt-evaluation
Evaluation of STT models for the German language
☆16 · Jan 22, 2022 · Updated 4 years ago
iisys-hof / HUI-Audio-Corpus-German
This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…
☆35 · Mar 31, 2023 · Updated 3 years ago
neulab / AfricanVoices
Hosts text-to-speech corpus and speech synthesizers for African languages.
☆19 · May 31, 2023 · Updated 3 years ago
ftyers / commonvoice-utils
Linguistic processing for Common Voice
☆59 · Jan 18, 2024 · Updated 2 years ago
KathyReid / opensource-voice-tools
A repo listing known open source voice tools, ordered by where they sit in the voice stack
☆28 · Sep 23, 2022 · Updated 3 years ago
Lelapa-AI / zindi-inkuba-notebook
☆11 · Mar 7, 2025 · Updated last year
mozilla / deepspeech-playbook
DEPRECATED - A crash course for training speech recognition models using DeepSpeech.
☆24 · May 16, 2021 · Updated 5 years ago
RichardLitt / stranded-by-trump
Helping travelers stranded by Trump
☆11 · Oct 5, 2022 · Updated 3 years ago
typotheque / syllabics-knowledge
Open source knowledge for Syllabics font design and development
☆10 · Nov 13, 2024 · Updated last year
repodiac / german_transliterate
Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to…
☆39 · Jan 16, 2021 · Updated 5 years ago
rhasspy / glow-speak
Neural text to speech system that uses eSpeak as a text/phoneme front-end
☆16 · Oct 20, 2021 · Updated 4 years ago
getalp / mass-dataset
MaSS - Multilingual corpus of Sentence-aligned Spoken utterances
☆50 · Sep 16, 2024 · Updated last year
CSTR-Edinburgh / qualtreats
Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations.
☆36 · Jun 25, 2024 · Updated 2 years ago
chdh / klatt-syn-app
GUI application for the Klatt formant synthesizer package
☆13 · Jun 26, 2026 · Updated last month
roedoejet / mothertongues
Mother Tongues Dictionaries dictionary creation tool
☆15 · May 21, 2024 · Updated 2 years ago
roedoejet / convertextract
Extract and find/replace text based on arbitrary correspondences while preserving original file formatting. This library is a fork from t…
☆11 · Sep 8, 2023 · Updated 2 years ago
ChanceNCounter / awesome-mycroft-community
Awesome stuff made by the Mycroft community
☆12 · Sep 16, 2021 · Updated 4 years ago
indonesian-nlp / wav2vec2-indonesian
☆20 · Apr 5, 2021 · Updated 5 years ago
freds0 / katube
KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l…
☆26 · Jul 27, 2024 · Updated last year
rasenganai / Illegal_Parking
Using an AI-based approach to detect illegal parking of vehicles (cars) from an image. The model will receive an image of parked car through…
☆11 · Jun 2, 2020 · Updated 6 years ago
verma-anushka / Gaming-Zone
The Gaming Zone is a web application that provides you with a collection of classic retro games, including puzzle games, trivia games, bo…
☆10 · Feb 11, 2020 · Updated 6 years ago
proger / uk
Phonograms and syntagmas: processing tools
☆21 · Jun 21, 2025 · Updated last year
Speech-Lab-IITM / data2vec-aqc
Repository having the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini…
☆13 · Mar 18, 2024 · Updated 2 years ago
QxLabIreland / listening-test
An open source platform for browser-based speech and audio subjective quality tests.
☆40 · Updated this week
speechio / asr-noises
A handy dataset of noises for ASR
☆22 · May 29, 2019 · Updated 7 years ago
rendchevi / nix-tts
🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation
☆264 · Nov 15, 2025 · Updated 8 months ago
olastor / german-word-frequencies
Simple word-to-frequency mappings for the German language, based on text corpora and using the CISTEM stemmer.
☆14 · Apr 3, 2021 · Updated 5 years ago
eddieantonio / unicode-default-word-boundary
Split words with Unicode's default word boundary specification
☆13 · Sep 12, 2024 · Updated last year
nils-werner / pymushra
pyMUSHRA is a Python web application which hosts webMUSHRA experiments and collects the data with Python.
☆47 · Jul 3, 2026 · Updated 3 weeks ago
nrc-cnrc / gramble
Domain-specific programming language for linguistic grammars and transducers — Langage dédié pour les grammaires linguistiques et les tra…
☆17 · Updated this week
Ahmad-Alaziz / ChadGPT
An experimental open-source attempt to make GPT-4 fully autonomous.
☆27 · Oct 27, 2023 · Updated 2 years ago
neulab / newlang-tech
A guide to building language technology in new languages.
☆59 · Feb 1, 2022 · Updated 4 years ago
smaybius / Coqui-TTS-GUI-solution
Interface for using TTS and vocoder models in the form of a text editor
☆20 · Nov 25, 2025 · Updated 8 months ago
ivandotv / nextjs-koa-api
Koa.js framework setup to run within Next.js API routes.
☆11 · May 23, 2026 · Updated 2 months ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 · Feb 13, 2021 · Updated 5 years ago
vincentqb / audio-tutorial
Experiments and tutorials with and for torchaudio
☆13 · May 7, 2021 · Updated 5 years ago