freds0/kabooks

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/freds0/kabooks)

freds0 / kabooks

KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using audiobooks, KABooks will generate dataset with segmented audios and aligned texts.

☆13

Alternatives and similar repositories for kabooks

Users that are interested in kabooks are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

miccio-dk / NISQA
View on GitHub
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16Apr 13, 2022Updated 4 years ago
WangHelin1997 / Aty-TTS
View on GitHub
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11May 14, 2025Updated last year
bryan051003 / USVG
View on GitHub
A unified model for zero-shot singing voice conversion and synthesis
☆22Nov 30, 2022Updated 3 years ago
maum-ai / sane-tts
View on GitHub
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11Jun 30, 2023Updated 3 years ago
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
kcvinker / Glance
View on GitHub
GUI Library for AutoIt, based on windows API.
☆12Jun 16, 2023Updated 3 years ago
hljodbokasafnid / Ascanius
View on GitHub
Automates the creation of full-text (sound and text) ebooks in epub/epub3/daisy format, the webserver/client creates smil files to sync a…
☆10Nov 12, 2021Updated 4 years ago
p0p4k / vits3_pytorch
View on GitHub
☆28Nov 15, 2023Updated 2 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
View on GitHub
☆11Nov 5, 2021Updated 4 years ago
jdevera / pylabeador
View on GitHub
A Python library and CLI tool to do automatic syllabification of Spanish words
☆15Jul 12, 2026Updated last week
shinhyeokoh / rwen
View on GitHub
☆14Jun 16, 2023Updated 3 years ago
FabioSmuu / Base-DiscordBot
View on GitHub
Esta é a minha handler para criar bot de disord.
☆11Sep 5, 2024Updated last year
printf-ana / python-curso-em-video
View on GitHub
Exercícios de Python referente ao Mundo 1 do Curso em Vídeo, do Gustavo Guanabara.
☆16May 19, 2020Updated 6 years ago
kjw11 / CSEnet-ASR
View on GitHub
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12Mar 14, 2025Updated last year
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
llm-lab-org / CLASP
View on GitHub
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13Jun 27, 2025Updated last year
pilarOG / unit_selection_tts
View on GitHub
Toy example on how to build a unit selection TTS in Spanish
☆11May 10, 2019Updated 7 years ago
leo-buneev / eslint-plugin-md
View on GitHub
Allows you to lint markdown code in your *.md files.
☆18Jan 5, 2023Updated 3 years ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Oct 14, 2024Updated last year
bookbot-hive / k2-indonesian-asr
View on GitHub
Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).
☆16Jun 30, 2023Updated 3 years ago
reppy4620 / convnext_tts
View on GitHub
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18Oct 20, 2024Updated last year
kevinsmithwebdev-zz / react-lectorem
View on GitHub
a React library to sync audio and text, highlighting the text as the audio "reads" it
☆13Jul 7, 2023Updated 3 years ago
google / airdialogue_model
View on GitHub
☆17Jul 16, 2020Updated 6 years ago
ashi-ta / speechGLUE
View on GitHub
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13Jun 2, 2023Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
cpii-cai / PunCantonese
View on GitHub
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15Dec 3, 2024Updated last year
gbegus / DeepPhonologyTool
View on GitHub
Train a fiwGAN or ciwGAN model using your own training data
☆14Oct 13, 2022Updated 3 years ago
mush42 / istft-onnx
View on GitHub
Export an ONNX graph that performs ISTFT. Designed for TTS models.
☆28Apr 23, 2024Updated 2 years ago
v-manhlt3 / m-LTM-Audio-Text-Retrieval
View on GitHub
☆13Jan 5, 2025Updated last year
DongKeon / webrtc-whisper-asr
View on GitHub
WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.
☆13Sep 27, 2024Updated last year
SidU / teams-chatgpt-js
View on GitHub
Sample bot to demonstrate how to integrate with the new gpt-3.5 turbo model that power ChatGPT
☆12Mar 3, 2023Updated 3 years ago
huutuongtu / Lightvoc
View on GitHub
LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM
☆18May 17, 2024Updated 2 years ago
jhuang448 / MultilingualALT
View on GitHub
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""
☆15Jun 28, 2024Updated 2 years ago
zelaki / DisfluentFA
View on GitHub
A Weakly Supervised Forced Alignment for disluent speech
☆15Nov 12, 2023Updated 2 years ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
Jackson-Kang / Prosody-augmentation-for-Text-to-speech
View on GitHub
Simple tool for speech dataset augmentation for modeling various prosodies.
☆14Jan 14, 2021Updated 5 years ago
alefiury / SE-R-2022-SER-Track
View on GitHub
Code for the winning solution in the SE&R 2022 Challenge - SER track.
☆16Mar 28, 2023Updated 3 years ago
deepakbaby / isegan
View on GitHub
Improved Speech Enhancement GANs
☆13Jun 24, 2020Updated 6 years ago
pezzetti / ie-validator
View on GitHub
Validator for Inscrição Estadual (IE)
☆11Dec 17, 2020Updated 5 years ago
iamanigeeit / present
View on GitHub
☆14Aug 19, 2024Updated last year
NTRLab / MediaSpeech
View on GitHub
☆22Jul 22, 2022Updated 4 years ago
ex3ndr / supervoice-hybrid
View on GitHub
My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26Aug 5, 2024Updated last year