jumon / whisper-finetuning
[WIP] Scripts for fine-tuning Whisper
☆220Updated 2 years ago
Alternatives and similar repositories for whisper-finetuning
Users who are interested in whisper-finetuning are comparing it to the libraries listed below.
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.☆308Updated 2 years ago
- ☆79Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions☆251Updated 4 months ago
- unofficial vits2-TTS implementation in pytorch☆523Updated last year
- Segment an audio file and obtain utterance alignments. (Python package)☆336Updated last year
- Updates ASR papers every day☆222Updated this week
- Finetune VITS and MMS using HuggingFace's tools☆154Updated last year
- Predicts the level of noise and reverberation in your audio files☆150Updated last year
- ☆103Updated this week
- Various speech datasets made available to the public☆118Updated 5 months ago
- a curated list of speech datasets (110+ datasets, 75+ easy to download)☆131Updated 2 years ago
- Easy-to-Use Speech MOS predictors☆288Updated last year
- NeMo text processing for ASR and TTS☆335Updated 2 weeks ago
- ☆359Updated 8 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3☆200Updated last year
- Unofficial implementation of NVIDIA P-Flow TTS paper☆223Updated 5 months ago
- ☆520Updated 10 months ago
- Fine-Tune Whisper with Transformers and PEFT☆57Updated last year
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.☆244Updated 11 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆194Updated 8 months ago
- Training code for FAcodec presented in NaturalSpeech3☆209Updated 9 months ago
- Application of MB-iSTFT-VITS components to vits2_pytorch☆126Updated 6 months ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆416Updated 2 months ago
- Train the next generation of TTS systems.☆165Updated 8 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆114Updated 2 years ago
- Multilingual G2P in 100 languages☆327Updated 2 years ago
- Variational Bayes HMM over x-vectors diarization☆270Updated last year
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch☆270Updated last year
- Official Implementation of StyleTTS☆432Updated 4 months ago
- UT-Sarulab MOS prediction system using SSL models☆237Updated last year