tiefenauer / forced-alignment

Forced alignment based on speech pauses using an RNN

☆9

Alternatives and similar repositories for forced-alignment

Users who are interested in forced-alignment compare it to the libraries listed below.

miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 Updated 3 years ago
KrishnaDN / BERTphone
Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"
☆17 Updated 4 years ago
anjandeepsahni / speech_phoneme_prediction
Phoneme prediction from speech mel-spectrograms using RNN.
☆14 Updated 6 years ago
MiniXC / phones
A collection of utilities for handling IPA phones.
☆25 Updated last year
hcy71o / AutoVocoder
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
☆70 Updated 2 years ago
speechio / asr-noises
A handy dataset of noises for ASR
☆21 Updated 6 years ago
huiw39 / ExtensibleTTS-PyTorch
An extensible speech synthesis system built with PyTorch; the original code is from r9y9's https://github.com/r9y9/nnmnkwii_gallery
☆26 Updated 5 years ago
MingjieChen / LowResourceVC
Voice conversion training with 109 speakers with limited training samples
☆35 Updated 4 years ago
tts-tutorial / icassp2022
☆64 Updated 3 years ago
MiniXC / LightningFastSpeech2
☆56 Updated 2 years ago
espnet / espnet_tts_frontend
Text frontend for ESPnet tts recipes
☆31 Updated 4 years ago
candlewill / RawNet
RawNet: Fast End-to-End Neural Vocoder
☆42 Updated 6 years ago
ljuvela / GELP
☆26 Updated 4 years ago
iisys-hof / HUI-Audio-Corpus-German
This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…
☆31 Updated 2 years ago
ajinkyakulkarni14 / ERISHA
ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…
☆43 Updated 4 years ago
AndreevP / speech_distances
Deep Speech Distances PyTorch
☆28 Updated 3 years ago
yoyolicoris / pytorch_FFTNet
A pytorch implementation of FFTNet.
☆37 Updated 6 years ago
vliu15 / adversarial-tts
End-to-end Text-to-Speech with Generative Adversarial Networks
☆20 Updated 4 years ago
babe269 / performant
A toolset for easy formant extraction and visualization from wav files and TTS models
☆31 Updated 2 years ago
MTG / PodcastMix-inference
☆32 Updated 3 years ago
brentspell / torch-yin
Yin pitch estimator in PyTorch
☆114 Updated 2 years ago
maxrmorrison / promonet
Prosody and Pronunciation Modification Network
☆54 Updated last month
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 Updated 2 years ago
AppleHolic / multiband_melgan
An unofficial implementation of https://arxiv.org/abs/2005.05106
☆46 Updated 4 years ago
maxrmorrison / torbi
Viterbi decoding in PyTorch
☆34 Updated 3 weeks ago
oleges1 / quartznet-pytorch
QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]
☆27 Updated 3 years ago
choiHkk / FastSpeech2-cwt
with alignment learning and continuous wavelet transform
☆21 Updated 2 years ago
jefflai108 / Unsupervised-TTS
☆42 Updated 3 years ago
BirdVox / PCEN-SNR
Audio activity detector based on per-channel energy normalization (PCEN)
☆29 Updated 6 years ago
archinetai / aligner-pytorch
Sequence alignment methods with helpers for PyTorch.
☆24 Updated 2 years ago