jumon/whisper-finetuning

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/jumon/whisper-finetuning)

jumon / whisper-finetuning

[WIP] Scripts for fine-tuning Whisper

☆221

Alternatives and similar repositories for whisper-finetuning

Users that are interested in whisper-finetuning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

krylm / whisper-event-tuning
View on GitHub
Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.
☆12Dec 24, 2022Updated 3 years ago
thuhcsi / SnakeGAN
View on GitHub
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37Apr 25, 2023Updated 3 years ago
sarulab-speech / whisper-asr-finetune
View on GitHub
☆32Dec 4, 2022Updated 3 years ago
jumon / whisper-punctuator
View on GitHub
Zero-shot multimodal punctuation insertion and truecasing using Whisper
☆120Feb 4, 2023Updated 3 years ago
vasistalodagala / whisper-finetune
View on GitHub
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.
☆365May 23, 2023Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
reppy4620 / convnext_tts
View on GitHub
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18Oct 20, 2024Updated last year
xinjli / alqalign
View on GitHub
multilingual speech aligner
☆78Nov 19, 2023Updated 2 years ago
bayartsogt-ya / whisper-multiple-hf-datasets
View on GitHub
Whisper fine-tuning event script to use multiple hf datasets
☆32Dec 20, 2022Updated 3 years ago
CODEJIN / VITS_Diffusion
View on GitHub
☆26Sep 22, 2022Updated 3 years ago
fss1t / CausalStarGANv2-VC
View on GitHub
☆22Apr 4, 2023Updated 3 years ago
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
MiscellaneousStuff / PhoneLM
View on GitHub
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48Sep 4, 2023Updated 2 years ago
spring-media / DeepForcedAligner
View on GitHub
☆81Aug 8, 2025Updated 11 months ago
yeyupiaoling / Whisper-Finetune
View on GitHub
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…
☆1,220May 8, 2026Updated 2 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
lesterphillip / SVCC23_FastSVC
View on GitHub
Singing Voice Conversion Challenge 2023 Starter Kit: FastSVC Reimplementation
☆116Nov 25, 2023Updated 2 years ago
jasonppy / PromptingWhisper
View on GitHub
Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
☆151Jan 16, 2024Updated 2 years ago
alefiury / SE-R-2022-SER-Track
View on GitHub
Code for the winning solution in the SE&R 2022 Challenge - SER track.
☆16Mar 28, 2023Updated 3 years ago
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated 2 years ago
ex3ndr / supervoice-hybrid
View on GitHub
My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26Aug 5, 2024Updated last year
CODEJIN / XiaoiceSing2
View on GitHub
☆19Feb 2, 2023Updated 3 years ago
Srijith-rkr / Whispering-LLaMA
View on GitHub
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
☆271May 19, 2024Updated 2 years ago
yl4579 / StyleTTS-VC
View on GitHub
Official Implementation of StyleTTS-VC
☆200Jan 14, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
MiniXC / LightningFastSpeech2
View on GitHub
☆55Jan 13, 2023Updated 3 years ago
sushant-t / tts-trainer
View on GitHub
Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…
☆30May 27, 2023Updated 3 years ago
cnaigithub / SpeechDewarping
View on GitHub
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27Apr 27, 2023Updated 3 years ago
WangHelin1997 / DuTa-VC
View on GitHub
Source code and demo for INTERPSEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…
☆38Dec 5, 2023Updated 2 years ago
HKAB / whisper-finetune-vietnamese
View on GitHub
Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM
☆38Oct 6, 2023Updated 2 years ago
y-chan / hifi-gan-misrnet
View on GitHub
unofficial pytorch implementation of HiFi-GAN with fast MISR.
☆15Mar 21, 2023Updated 3 years ago
thuhcsi / NeuFA
View on GitHub
Neural network-based forced alignment with bidirectional attention mechanism
☆78Jan 17, 2025Updated last year
MelissaChen15 / control-vc
View on GitHub
This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"
☆132Nov 29, 2023Updated 2 years ago
hcy71o / AutoVocoder
View on GitHub
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
☆71Dec 2, 2022Updated 3 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
lifeiteng / NaturalSpeech2
View on GitHub
☆33Jun 29, 2023Updated 3 years ago
PlayVoice / VI-SVS
View on GitHub
Singing Voice Synthesis based on VITS, different from VISinger
☆198Nov 13, 2023Updated 2 years ago
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
alumae / streaming-punctuator
View on GitHub
☆17Apr 14, 2023Updated 3 years ago
revsic / torch-nansy
View on GitHub
Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513
☆64Feb 13, 2023Updated 3 years ago
adelacvg / NS2VC
View on GitHub
Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech
☆237Feb 29, 2024Updated 2 years ago
p0p4k / Matcha-TTS-2
View on GitHub
E2E TTS using Conditional Flow Matching (Experimental*)
☆71Nov 10, 2023Updated 2 years ago