i4Ds/whisper-finetune

i4Ds / whisper-finetune

This repository contains code for fine-tuning the Whisper speech-to-text model.

☆24

Alternatives and similar repositories for whisper-finetune

Users who are interested in whisper-finetune are comparing it to the libraries listed below.

i4Ds / whisper-prep
Data preparation utility for the fine-tuning of OpenAI's Whisper model.
☆16 · Jun 18, 2026 · Updated last month
Ranjan-Shettigar / Skin-Cancer-Detection-Classification
Skin cancer classification project using deep learning techniques for automated diagnosis of skin lesions.
☆11 · Jun 2, 2024 · Updated 2 years ago
tim-roderick / VST
Video Summarization Transformer: Implementation in PyTorch of the Transformer model for video summarisation
☆10 · Oct 27, 2020 · Updated 5 years ago
ggambetta / mlt2fcp
MLT (Kdenlive, others) to FCP (Final Cut Pro, DaVinci Resolve, others) video project converter
☆10 · Apr 30, 2019 · Updated 7 years ago
zqlsnr / DPCRN
Real-time speech enhancement
☆18 · Jan 23, 2024 · Updated 2 years ago
msalhab96 / AraSpell
A framework for Arabic spelling correction using different seq2seq model architectures such as transformers and RNNs
☆25 · Jul 21, 2024 · Updated 2 years ago
tatianapassali / artificial-disfluency-generation
Generating artificial disfluencies from fluent text easily and promptly
☆16 · Sep 28, 2022 · Updated 3 years ago
mohan696matlab / whisper-finetuning-youtube-serise
☆16 · May 14, 2025 · Updated last year
arda-num / SFSRNet
Reproduction of the paper SFSRNet: Super-resolution for single-channel Audio Source Separation by me (@arda-num) and @dritx16. Navigate P…
☆12 · Jul 7, 2022 · Updated 4 years ago
siddh30 / The-Airbnb-Classification-Project
This project is from the Airbnb Recruitment Challenge on Kaggle. The challenge is to solve a multi-class classification problem of predic…
☆11 · Feb 22, 2022 · Updated 4 years ago
Mildemelwe / Non-English-Tacotron-2-Training-Notebook
Tacotron 2 training notebook supporting Japanese, French, and Mandarin
☆11 · Nov 19, 2022 · Updated 3 years ago
xjuspeech / YOLOPitch
☆10 · Jun 11, 2024 · Updated 2 years ago
carlosabalde / mobiledetect2vcl
Python script to transform the Mobile Detect JSON database into a UA-based mobile detection VCL subroutine easily integrable in any Varn…
☆14 · Nov 13, 2023 · Updated 2 years ago
AlonAzrael / keras-aquarium
A small collection of models implemented in Keras, including matrix factorization (recommendation system), topic modeling, text classifica…
☆14 · Jul 12, 2017 · Updated 9 years ago
dreji18 / Fine-tune-Speech-Recognition
Tutorial on how to train a custom voice recognition model using Hugging Face models.
☆11 · Jul 2, 2023 · Updated 3 years ago
HLasse / multidiagnosis-speech
☆10 · Jun 23, 2023 · Updated 3 years ago
WangHelin1997 / GL-AT
PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.
☆13 · Feb 6, 2021 · Updated 5 years ago
k2-fsa / kaldi-decoder
Decoders from Kaldi using OpenFst
☆35 · Apr 10, 2026 · Updated 3 months ago
deepberlin1 / aiforgood2020
General information about DEEP BERLIN's AI for Good Hackathon 2020
☆11 · Apr 14, 2020 · Updated 6 years ago
iamcam / ai-wordpress-rag-demo
This small project demonstrates how to integrate WordPress blog entries into queries for a RAG-based (Retrieval-Augmented Generation) lan…
☆11 · Apr 2, 2024 · Updated 2 years ago
jakariaemon / WSI
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
☆26 · Jun 29, 2026 · Updated last month
rosinality / melgan-pytorch
MelGAN and Tacotron 2 in PyTorch
☆11 · Oct 22, 2019 · Updated 6 years ago
flaport / torch_lfilter
Bring low-pass filtering to PyTorch!
☆28 · Sep 3, 2020 · Updated 5 years ago
ZhaoZeyu1995 / BenNevis
A Differentiable WFST-based End-to-End Automatic Speech Recognition toolkit with flexible topology support
☆12 · Feb 15, 2026 · Updated 5 months ago
wiragotama / TIARA-annotationTool
An Interactive Tool for Annotating Discourse Structure and Text Improvement
☆16 · Sep 15, 2021 · Updated 4 years ago
TIGER-AI-Lab / LLM-AMT
This repository contains the code for our paper "Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering" [EMNLP…
☆14 · Oct 8, 2024 · Updated last year
antonio-f / Dynamic-Programming
Algorithms for Policy Evaluation, Estimation of Action Values, Policy Improvement, Policy Iteration, Truncated Policy Evaluation, Truncat…
☆11 · Apr 3, 2019 · Updated 7 years ago
sushant-t / tts-trainer
Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…
☆30 · May 27, 2023 · Updated 3 years ago
Shelton1013 / SwitchLingua
[NeurIPS 25] SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
☆21 · Sep 19, 2025 · Updated 10 months ago
chrismcguire / gobberish
Generates random utf-8 strings for fuzz testing character encoding problems
☆11 · Aug 21, 2015 · Updated 10 years ago
frankkramer-lab / GPTNERMED
GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.
☆15 · Oct 5, 2023 · Updated 2 years ago
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
AIFSH / ComfyUI-StreamV2V
☆12 · May 31, 2024 · Updated 2 years ago
vmanita / Customer-purchase-prediction
Classification machine learning models to predict the probability of a client accepting a future marketing campaign/product release.
☆17 · Jul 27, 2020 · Updated 6 years ago
artbataev / end2end
Losses and decoders for end-to-end ASR and OCR
☆34 · Oct 30, 2020 · Updated 5 years ago
AbhinavUtkarsh / Image-Segmentation
Image segmentation by KNN Algorithm project Report for subject Digital Image Processing (CS1553). This Project has an analysis of K - Nea…
☆11 · Aug 20, 2023 · Updated 2 years ago
RazhanHameed / kurdish-llama
This is an attempt to fine-tune the Llama model for Central Kurdish.
☆17 · May 24, 2023 · Updated 3 years ago
kiang / map.coa.gov.tw
Working with data from map.coa.gov.tw
☆15 · Feb 26, 2018 · Updated 8 years ago
HuPER29 / HuPER
☆16 · Mar 19, 2026 · Updated 4 months ago