bigcash / awesome-vad
A curated list of awesome voice activity detection
☆29 · Updated 2 months ago
Alternatives and similar repositories for awesome-vad:
Users who are interested in awesome-vad are comparing it to the libraries listed below.
- Implementation of Google's USM speech model in PyTorch ☆27 · Updated this week
- Speaker change detection using SincNet and an LSTM/Transformer ☆46 · Updated 7 months ago
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation ☆12 · Updated last month
- Official Code for ParrotTTS ☆46 · Updated 3 months ago
- Use VITS and Opencpop to develop singing voice synthesis; different from VISinger. ☆35 · Updated last year
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 · Updated last year
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the… ☆47 · Updated 2 years ago
- StyleTTS 2 Optimized Training Fork ☆18 · Updated this week
- Project of Singing Voice Conversion. ☆14 · Updated last year
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆30 · Updated last year
- Adaptive Vocoder for Custom Voice ☆59 · Updated 2 years ago
- Export an ONNX graph that performs ISTFT. Designed for TTS models. ☆23 · Updated 9 months ago
- Python bindings of speexdsp noise suppression library ☆36 · Updated 2 years ago
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆28 · Updated 8 months ago
- Tunable pipelines ☆31 · Updated 2 weeks ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Updated 2 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- BEGANSing - Korean SVS + SVC + AudioSR ☆12 · Updated 11 months ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆69 · Updated 2 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆39 · Updated last year
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale ☆27 · Updated last year
- An implementation for Frame-level Speech Signal-to-Noise Ratio Estimation using deep learning ☆35 · Updated 2 years ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) in which the decoder has been replaced with Vocos. ☆28 · Updated 3 months ago
- Simple PyTorch Denoisers for Waveform Audio ☆34 · Updated last month