kaistmm / voxsim_trainerView external linksLinks
[INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset
☆12Sep 29, 2025Updated 4 months ago
Alternatives and similar repositories for voxsim_trainer
Users that are interested in voxsim_trainer are comparing it to the libraries listed below
Sorting:
- [ICASSP 2024] Official code for FreGrad☆35May 13, 2024Updated last year
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated last year
- ☆13Sep 12, 2024Updated last year
- Speech enhancement in noisy and reverberant environments using deep neural networks☆22Oct 10, 2025Updated 4 months ago
- Real-time end-to-end singing voice convertion☆23Nov 3, 2024Updated last year
- ☆52Jul 16, 2025Updated 6 months ago
- ☆27Sep 5, 2024Updated last year
- Streaming Vocos☆29Jun 10, 2025Updated 8 months ago
- HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz☆24Jan 2, 2024Updated 2 years ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- ☆10Dec 22, 2023Updated 2 years ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆15Aug 1, 2024Updated last year
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆10Sep 30, 2024Updated last year
- ☆11Nov 7, 2024Updated last year
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- ☆26Mar 20, 2024Updated last year
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- A collection of utilities for handling IPA phones.☆26Sep 24, 2023Updated 2 years ago
- Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction☆13Jul 22, 2024Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition☆11Mar 14, 2025Updated 10 months ago
- Project of Singing Voice Conversion.☆15Oct 27, 2023Updated 2 years ago
- ☆15Nov 11, 2024Updated last year
- ☆14Aug 1, 2025Updated 6 months ago
- Official repo for DisCoder: High-Fidelity Music Vocoder using Neural Audio Codecs presented at ICASSP 2025☆37Feb 24, 2025Updated 11 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.☆27Mar 14, 2025Updated 10 months ago
- An High-resolution implementation of HiFi-GAN Vocoder for Voice Conversion.☆32Apr 10, 2023Updated 2 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…☆46Jul 2, 2024Updated last year
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated 10 months ago
- My vocoder experiments☆31Jul 26, 2025Updated 6 months ago
- Multispeaker Community Vocoder Model for DiffSinger☆39Aug 11, 2025Updated 6 months ago
- [ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis☆52Apr 9, 2025Updated 10 months ago
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆36Jan 17, 2024Updated 2 years ago
- ☆15Mar 31, 2025Updated 10 months ago
- ☆14Aug 19, 2024Updated last year
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Jan 20, 2025Updated last year
- ☆24Mar 29, 2025Updated 10 months ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA☆18Aug 16, 2024Updated last year
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"☆34Nov 23, 2023Updated 2 years ago