jack-tol/fine-tuning-whisper-on-atc-data

This repository includes training, inference, evaluation, and utility scripts developed for fine-tuning the Whisper medium.en model on Air Traffic Control (ATC) data.

☆29

Alternatives and similar repositories for fine-tuning-whisper-on-atc-data

Users who are interested in fine-tuning-whisper-on-atc-data are comparing it to the repositories listed below.

jlvdoorn / WhisperATC
Applying Large-Scale Weakly-Supervised Automatic Speech Recognition to Air Traffic Control
☆45 · Nov 29, 2023 · Updated 2 years ago
lucadellalib / discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
☆26 · Aug 28, 2024 · Updated last year
egorsmkv / optimized-whisper
Use quantized versions of Whisper to speed up inference
☆12 · Oct 16, 2024 · Updated last year
alxmamaev / ultimate_tts
☆13 · Aug 7, 2021 · Updated 4 years ago
Ranjan-Shettigar / Skin-Cancer-Detection-Classification
Skin cancer classification project using deep learning techniques for automated diagnosis of skin lesions.
☆11 · Jun 2, 2024 · Updated 2 years ago
ICASSP2021-tutorial9 / Distant_conversational_ASR_and_analysis
☆12 · Jun 10, 2021 · Updated 5 years ago
William1617 / REAL_TIME_NKF_AEC
☆24 · Jul 29, 2024 · Updated last year
Honee-W / CPTNN
Unofficial implementation of "CPTNN: Cross-Parallel Transformer Neural Network for Time-Domain Speech Enhancement"
☆15 · Nov 14, 2023 · Updated 2 years ago
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
jhuang448 / MultilingualALT
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"
☆15 · Jun 28, 2024 · Updated 2 years ago
BUTSpeechFIT / DeCRED
☆18 · Aug 13, 2025 · Updated 11 months ago
ZhaoF-i / SDAEC
☆19 · Jan 6, 2025 · Updated last year
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Apr 14, 2026 · Updated 3 months ago
llm-lab-org / CLASP
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13 · Jun 27, 2025 · Updated last year
chaufanglin / Normal2Whisper
Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"
☆14 · Oct 31, 2024 · Updated last year
ineffab1e-vista / s4-dynamic-range-compressor
Ongoing VA modeling research. Modeling a dynamic range compressor using S4.
☆19 · Nov 29, 2025 · Updated 7 months ago
reppy4620 / convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18 · Oct 20, 2024 · Updated last year
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
anas-rz / specmix-pytorch
A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features
☆10 · Oct 5, 2022 · Updated 3 years ago
daanzu / wav2vec2_stt_python
Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni…
☆23 · Aug 16, 2021 · Updated 4 years ago
rmarcacini / ser-coraa-pt-br
Emotion Recognition from Brazilian Portuguese Informal Spontaneous Speech
☆22 · Mar 21, 2022 · Updated 4 years ago
vadimkantorov / convasr
Baseline convolutional ASR system in PyTorch
☆21 · Nov 16, 2023 · Updated 2 years ago
wandaweb / Fooocus-Kaggle
Kaggle notebook for Fooocus
☆11 · Jun 16, 2025 · Updated last year
aqtq314 / VogenSVS
☆15 · Apr 16, 2026 · Updated 3 months ago
mohan696matlab / whisper-finetuning-youtube-serise
☆16 · May 14, 2025 · Updated last year
Martossien / transcria
Self-hosted meeting transcription portal: speech-to-text, speaker diarization, LLM-corrected transcripts, structured summaries and Word …
☆24 · Updated this week
michaelmorr82 / Machine-Learning-Coursera-Andrew-Ng
MATLAB and Python solutions for the Machine Learning course on Coursera by Andrew Ng
☆11 · Jun 23, 2018 · Updated 8 years ago
kamilakesbi / DiarizersLM
☆15 · Jul 16, 2024 · Updated 2 years ago
JacobLinCool / MPSENet
Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
☆22 · Nov 1, 2024 · Updated last year
kurianbenoy / whisper_normalizer
A Python package for the Whisper normalizer
☆79 · Updated this week
audiolabs / PESQ
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band) - including P.862 Corrigendum 2 (03/…
☆23 · May 27, 2025 · Updated last year
NTIA / alignnet
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18 · Aug 1, 2025 · Updated 11 months ago
subhasisj / FastAPI-Streamlit-Docker-NLP
Text Classification model deployment using FastAPI, Streamlit and Docker Compose
☆15 · Feb 12, 2021 · Updated 5 years ago
nDmitry / ogimgd
Social previews generator as a microservice.
☆12 · Apr 9, 2022 · Updated 4 years ago
kgnlp / allophant
A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.
☆30 · Mar 14, 2025 · Updated last year
BUTSpeechFIT / mt-asr-data-prep
☆25 · Feb 26, 2026 · Updated 4 months ago
jwr1995 / PubSep
Repository of published DNN speech separation recipes for a number of datasets
☆13 · Jan 22, 2024 · Updated 2 years ago
WangHelin1997 / DuTa-VC
Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…
☆38 · Dec 5, 2023 · Updated 2 years ago
tamaraabuhawileh / Skin-Cancer-Object-Detection-YOLO
Skin Cancer Object Detection-YOLOv5-YOLOv8
☆17 · Jun 5, 2024 · Updated 2 years ago