nc-ai/speech

☆17

Alternatives and similar repositories for speech

Users who are interested in speech are comparing it to the libraries listed below.

yc9701 / pansori-tedxkr-corpus
Korean ASR Corpus generated from TEDx talks
☆27 • Jan 11, 2019 • Updated 7 years ago
dafyddg / RFA
Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…
☆17 • Apr 27, 2023 • Updated 3 years ago
rhoposit / icassp2021
☆15 • May 8, 2021 • Updated 5 years ago
revsic / torch-retriever-vc
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12 • Jan 27, 2023 • Updated 3 years ago
Kyubyong / kss
☆70 • Jan 7, 2021 • Updated 5 years ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 • Feb 13, 2021 • Updated 5 years ago
RF5 / transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
☆80 • Sep 27, 2023 • Updated 2 years ago
zldzmfoq12 / VCtube
A package for crawling audio from YouTube
☆42 • Aug 8, 2023 • Updated 2 years ago
ga642381 / RobustVC
**ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…
☆24 • Sep 27, 2022 • Updated 3 years ago
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 • Sep 4, 2023 • Updated 2 years ago
fabianoluzbr / neural-g2p-portuguese
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It has a highly es…
☆19 • Jun 14, 2021 • Updated 5 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 • Mar 26, 2022 • Updated 4 years ago
mutiann / speech_rankings
A CSRankings-like index for speech researchers
☆35 • Oct 16, 2024 • Updated last year
kakaobrain / jejueo
Jejueo Datasets for Machine Translation and Speech Synthesis
☆82 • Feb 19, 2020 • Updated 6 years ago
LEEYOONHYUNG / BVAE-TTS
Official implementation of BVAE-TTS
☆173 • Sep 26, 2022 • Updated 3 years ago
odunola499 / f5-lora
☆19 • Nov 18, 2025 • Updated 8 months ago
Jackson-Kang / Prosody-augmentation-for-Text-to-speech
Simple tool for speech dataset augmentation for modeling various prosodies.
☆14 • Jan 14, 2021 • Updated 5 years ago
keonlee9420 / Robust_Fine_Grained_Prosody_Control
PyTorch Implementation of Robust and fine-grained prosody control of end-to-end speech synthesis
☆41 • Feb 20, 2022 • Updated 4 years ago
PlayVoice / VI-Speaker
Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.
☆30 • Sep 16, 2022 • Updated 3 years ago
richardbaihe / a3t
Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
☆89 • Sep 6, 2024 • Updated last year
Yangyangii / TPGST-Tacotron
Google's TPGST reimplementation.
☆34 • Dec 11, 2019 • Updated 6 years ago
isjwdu / DFADD
Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset
☆16 • Apr 7, 2025 • Updated last year
sarulab-speech / lightweight_spkr_anon
Lightweight speaker anonymization [IEEE SLT2021]
☆27 • Jun 6, 2022 • Updated 4 years ago
revsic / torch-nansy
Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513
☆64 • Feb 13, 2023 • Updated 3 years ago
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 • Jul 20, 2025 • Updated last year
Yeongtae / tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
☆30 • May 28, 2020 • Updated 6 years ago
pilot7747 / VoxDIY
This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper.
☆16 • Jul 22, 2021 • Updated 5 years ago
Yangyangii / DeepConvolutionalTTS-pytorch
Deep Convolutional TTS pytorch implementation
☆27 • Jul 2, 2019 • Updated 7 years ago
KrishnaDN / BERTphone
Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"
☆17 • Dec 10, 2020 • Updated 5 years ago
homink / speech.ko
Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language
☆43 • Feb 28, 2018 • Updated 8 years ago
Deepest-Project / AlignTTS
Implementation of the AlignTTS
☆77 • Jul 6, 2023 • Updated 3 years ago
HGU-DLLAB / Korean-FastSpeech2-Pytorch
Implementation of Korean FastSpeech2
☆215 • Jan 29, 2023 • Updated 3 years ago
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 • Aug 5, 2024 • Updated last year
YuvalBecker / MelNet
MelNet-Tensorflow implementation
☆41 • Dec 1, 2020 • Updated 5 years ago
dathudeptrai / FastSpeech2
A Tensorflow Implementation of the FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
☆11 • Aug 12, 2020 • Updated 5 years ago
clovaai / ClovaCall
ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020)
☆223 • Apr 5, 2022 • Updated 4 years ago
yl4579 / PitchExtractor
Deep Neural Pitch Extractor for Voice Conversion and TTS Training
☆151 • Aug 22, 2022 • Updated 3 years ago
alumae / streaming-punctuator
☆17 • Apr 14, 2023 • Updated 3 years ago
sarulab-speech / multi-speaker-dgp
Official implementation of DGP-based multi-speaker speech synthesis with PyTorch
☆24 • Mar 23, 2021 • Updated 5 years ago