ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆10Updated 2 months ago
Alternatives and similar repositories for InaGVAD:
Users who are interested in InaGVAD are comparing it to the repositories listed below.
- C++ version of pyannote audio overlapped speech detection pipeline☆12Updated last year
- Forced alignment decoder for Whisper.☆14Updated last year
- ☆13Updated 5 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Updated last year
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.☆20Updated 2 weeks ago
- ☆12Updated 2 months ago
- ☆13Updated 7 months ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆12Updated 2 weeks ago
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.☆12Updated 4 months ago
- Production-ready vocoder using BigVSAN☆11Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Updated 2 years ago
- Source code of EfficientTTS 2☆12Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆12Updated 6 months ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆16Updated 10 months ago
- ☆10Updated 4 months ago
- This repository contains all the code necessary for running the multilingual DistilWhisper from the Ferraz et al. 2024 IEEE ICASSP paper.☆20Updated last year
- ☆11Updated 2 years ago
- DUSTED: Spoken-Term Discovery using Discrete Speech Units☆15Updated 5 months ago
- Just another FastSpeech 2 but cleaner code :)☆26Updated 9 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Updated last year
- ☆10Updated last month
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆14Updated 3 months ago
- A collection of all our phonemizers for dataset construction and inference☆22Updated last month
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆15Updated 5 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024)☆12Updated 6 months ago
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆18Updated 2 years ago
- StyleTTS 2 Optimized Training Fork☆26Updated last month
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering☆19Updated last year
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-…☆25Updated 6 months ago
- A simple command line tool to calculate WER for ASR.☆14Updated 5 months ago