WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models with PyTorch backend.
☆92Jun 6, 2021Updated 4 years ago
Alternatives and similar repositories for wavencoder
Users that are interested in wavencoder are comparing it to the libraries listed below
Sorting:
- The History of Speech Recognition to the Year 2030☆13Aug 14, 2021Updated 4 years ago
- Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)☆104Nov 26, 2022Updated 3 years ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma…☆21Jan 24, 2022Updated 4 years ago
- ☆61Jan 31, 2023Updated 3 years ago
- Efficient Speech Processing Tookit for Automatic Speaker Recognition☆17Feb 8, 2023Updated 3 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆68May 12, 2021Updated 4 years ago
- Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021☆11Jun 13, 2021Updated 4 years ago
- ☆11Nov 5, 2021Updated 4 years ago
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based in the neural network architec…☆46Oct 3, 2023Updated 2 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Feb 13, 2021Updated 5 years ago
- ☆229Nov 13, 2023Updated 2 years ago
- Wav2Vec for speech recognition, classification, and audio classification☆274Apr 2, 2022Updated 3 years ago
- Speaker embedding (d-vector) trained with GE2E loss☆286Jan 8, 2024Updated 2 years ago
- Voice conversion training with 109 speakers with limited training samples☆35Dec 21, 2020Updated 5 years ago
- A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder☆171Jul 25, 2024Updated last year
- Memory efficient transducer loss computation☆69Jun 10, 2022Updated 3 years ago
- Deep Discriminative Embeddings for Duration Robust Speaker Verification☆19Dec 16, 2019Updated 6 years ago
- ☆76Oct 25, 2021Updated 4 years ago
- Web page for ISCA Special Interest Group: Robust Speech Processing (RoSP)☆11Dec 4, 2023Updated 2 years ago
- A repository comprising of code for generation of noisy speech data from clean data using deep learning methods☆16Jul 12, 2021Updated 4 years ago
- A library for speech data augmentation in time-domain☆683Aug 30, 2021Updated 4 years ago
- One-shot TTS with Improved Unseen Speaker and Style Transfer☆37Mar 2, 2022Updated 4 years ago
- Torch-based tool for quantizing high-dimensional vectors using additive codebooks☆54May 25, 2022Updated 3 years ago
- ERISHA is a mulitilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…☆43Dec 17, 2020Updated 5 years ago
- Prosodic Speech Segmentation with Transformers☆26Feb 25, 2024Updated 2 years ago
- Python wrapper for kaldi's arpa2fst☆38Aug 27, 2025Updated 6 months ago
- Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard fea…☆15Jun 12, 2023Updated 2 years ago
- ☆17Apr 28, 2021Updated 4 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆80May 20, 2023Updated 2 years ago
- ☆12Jun 10, 2021Updated 4 years ago
- [ICASSP'23] Online speaker clustering☆17Feb 22, 2026Updated last week
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations☆38Jun 12, 2023Updated 2 years ago
- [InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei …☆209Dec 8, 2022Updated 3 years ago
- TTS Text Analyzer☆32Jul 20, 2023Updated 2 years ago
- ☆28Oct 7, 2025Updated 4 months ago
- Incorporating AutoVocoder to MB-iSTFT-VITS☆48Dec 1, 2022Updated 3 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Mar 6, 2023Updated 2 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆16Mar 28, 2023Updated 2 years ago