Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper
☆141 · Aug 3, 2023 · Updated 2 years ago
Alternatives and similar repositories for GPV
Users who are interested in GPV are comparing it to the libraries listed below.
- The codebase for Data-driven general-purpose voice activity detection. ☆93 · Aug 3, 2023 · Updated 2 years ago
- Materials of public talks given by SJTU X-LANCE members ☆14 · Dec 3, 2022 · Updated 3 years ago
- Repo for our pooling approach on the DCASE2018 task4 ☆15 · Jul 6, 2023 · Updated 2 years ago
- Code for reproducing experiments in "Domain-Adversarial Voice Activity Detection" ☆23 · Mar 3, 2020 · Updated 6 years ago
- Permutation invariant training in PyTorch ☆13 · Oct 2, 2020 · Updated 5 years ago
- ☆53 · May 15, 2025 · Updated 10 months ago
- Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset. ☆869 · Jun 9, 2021 · Updated 4 years ago
- Voice Activity Detection based on Deep Learning & TensorFlow ☆371 · Mar 24, 2023 · Updated 3 years ago
- PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021 ☆159 · Oct 26, 2021 · Updated 4 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ☆151 · Jun 5, 2025 · Updated 9 months ago
- Benchmark popular audio I/O packages ☆151 · Dec 19, 2023 · Updated 2 years ago
- Simple DNN based Voice Activity Detection (VAD) using PyTorch ☆42 · Feb 8, 2020 · Updated 6 years ago
- ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS ☆33 · Mar 16, 2023 · Updated 3 years ago
- A Python library for voice activity detection (VAD) for speech/non-speech segmentation. ☆88 · Sep 7, 2022 · Updated 3 years ago
- Exploring Binary Classification Loss for Speaker Verification ☆18 · Jul 18, 2023 · Updated 2 years ago
- Speech enhancement using mimic loss ☆16 · Oct 25, 2019 · Updated 6 years ago
- Python loaders for many Real Room Impulse Response databases ☆96 · Sep 30, 2024 · Updated last year
- Voice Activity Detection (VAD) using deep learning. ☆204 · Oct 14, 2019 · Updated 6 years ago
- This repo contains the baseline model recipes and pre-trained model for GramVanni Hindi ASR challenge ☆15 · Mar 26, 2022 · Updated 4 years ago
- End-to-End Neural Diarization ☆423 · Aug 30, 2021 · Updated 4 years ago
- A library for speech data augmentation in time-domain ☆684 · Aug 30, 2021 · Updated 4 years ago
- Diarization scoring tools. ☆262 · Mar 28, 2023 · Updated 2 years ago
- A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. ☆1,853 · Jul 22, 2025 · Updated 8 months ago
- Speech Dereverberation using Fully Convolutional Networks ☆77 · Aug 18, 2020 · Updated 5 years ago
- A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR ☆1,043 · Jul 5, 2023 · Updated 2 years ago
- This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech" ☆15 · Apr 8, 2024 · Updated last year
- Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR. ☆523 · Feb 17, 2022 · Updated 4 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 · Feb 13, 2021 · Updated 5 years ago
- Production First and Production Ready End-to-End Text-to-Speech Toolkit ☆417 · Nov 20, 2025 · Updated 4 months ago
- [InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei … ☆209 · Dec 8, 2022 · Updated 3 years ago
- ☆76 · Oct 25, 2021 · Updated 4 years ago
- Implementation of Neural PLDA (NPLDA) model (A discriminative backend for Speaker Verification) ☆100 · Apr 20, 2020 · Updated 5 years ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap" ☆125 · Apr 8, 2022 · Updated 3 years ago
- ☆555 · Jun 11, 2021 · Updated 4 years ago
- Python library for Room Impulse Response (RIR) simulation with GPU acceleration ☆586 · Jul 18, 2025 · Updated 8 months ago
- Tools for Speech Enhancement integrated with Kaldi ☆428 · Jul 6, 2023 · Updated 2 years ago
- MagicData-RAMC Dataset and Baseline ☆58 · Sep 13, 2022 · Updated 3 years ago
- Big Impulse Response Dataset ☆156 · Oct 19, 2022 · Updated 3 years ago
- Repo associated with the DESED dataset, download and creation of data ☆146 · Jul 16, 2024 · Updated last year