Wataru-Nakata / miipher
Unofficial implementation of miipher
☆131, updated last year
Alternatives and similar repositories for miipher
Users who are interested in miipher are comparing it to the libraries listed below.
- Reference-aware automatic speech evaluation toolkit ☆163, updated 9 months ago
- UTokyo-SaruLab MOS Prediction System ☆233, updated this week
- MOS score prediction by fine-tuned wav2vec2.0 model ☆165, updated 2 years ago
- Speech Human Evaluation Estimation Toolkit (SHEET) ☆104, updated this week
- UT-Sarulab MOS prediction system using SSL models ☆262, updated last year
- PyTorch implementation of BigVSAN ☆203, updated last year
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning ☆150, updated last year
- An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data" ☆136, updated 2 years ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS ☆129, updated last year
- S3PRL-VC: A Voice Conversion Toolkit based on S3PRL ☆101, updated last year
- High-Fidelity Neural Phonetic Posteriorgrams ☆115, updated 6 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆204, updated last year
- Official implementation of the source-filter HiFiGAN vocoder ☆260, updated 2 years ago
- A sequence-to-sequence voice conversion toolkit. ☆102, updated last year
- ☆102, updated 2 years ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec ☆108, updated 2 months ago
- Easy-to-Use Speech MOS predictors ☆311, updated last year
- Expressive Anechoic Recordings of Speech (EARS) ☆187, updated last year
- [InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter ☆92, updated last year
- HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform ☆209, updated 7 months ago
- HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement ☆155, updated 3 years ago
- Unofficial implementation of NVIDIA P-Flow TTS paper ☆228, updated 8 months ago
- Implementation of BEST-RQ, a model for self-supervised learning of speech signals using a random projection quantizer, in PyTorch. ☆123, updated last year
- ☆154, updated 11 months ago
- The TTSDS benchmark evaluates synthetic speech quality by considering prosody, speaker identity, and intelligibility, comparing these fac… ☆63, updated last month
- NOTSOFAR-1 Challenge: Distant Diarization and ASR ☆56, updated 7 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆94, updated 8 months ago
- ☆145, updated 4 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆172, updated last year
- Predicts the level of noise and reverberation in your audio files ☆160, updated 2 months ago