jumon / whisper-finetuning
[WIP] Scripts for fine-tuning Whisper
☆217 Updated last year
Alternatives and similar repositories for whisper-finetuning:
Users who are interested in whisper-finetuning are comparing it to the libraries listed below.
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face. ☆278 Updated last year
- ☆70 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆107 Updated last year
- Multilingual G2P in 100 languages ☆296 Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆229 Updated 2 weeks ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ☆298 Updated 2 months ago
- ☆489 Updated 6 months ago
- Text to speech alignment using CTC forced alignment ☆206 Updated last week
- PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor ☆277 Updated last year
- Easy-to-Use Speech MOS predictors ☆256 Updated last year
- Unofficial vits2-TTS implementation in PyTorch ☆505 Updated 10 months ago
- Predicts the level of noise and reverberation in your audio files ☆144 Updated 8 months ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ☆148 Updated last year
- Finetune VITS and MMS using HuggingFace's tools ☆131 Updated 9 months ago
- A curated list of speech datasets (110+ datasets, 75+ easy to download) ☆120 Updated last year
- Various speech datasets made available to the public ☆110 Updated last month
- ☆334 Updated 4 months ago
- Training code for FAcodec presented in NaturalSpeech3 ☆192 Updated 5 months ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆158 Updated 10 months ago
- Application of MB-iSTFT-VITS components to vits2_pytorch ☆121 Updated 2 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆184 Updated 9 months ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆325 Updated 8 months ago
- Official Implementation of StyleTTS ☆414 Updated 2 weeks ago
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction ☆245 Updated 8 months ago
- CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus ☆195 Updated 2 years ago
- Train the next generation of TTS systems. ☆162 Updated 4 months ago
- Synchronize Whisper's timestamps over an existing accurate transcription ☆138 Updated 8 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆78 Updated last year
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in PyTorch ☆265 Updated last year
- Unofficial implementation of NVIDIA P-Flow TTS paper ☆220 Updated last month