jlvdoorn / WhisperATC
Applying Large-Scale Weakly-Supervised Automatic Speech Recognition to Air Traffic Control
☆32Updated last year
Alternatives and similar repositories for WhisperATC
Users who are interested in WhisperATC are comparing it to the libraries listed below.
- A Corpus for Research on Robust Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications☆65Updated 2 years ago
- ☆38Updated 11 months ago
- This repository includes training, inference, evaluation, and utility scripts developed for fine-tuning the Whisper medium.en model on Ai…☆11Updated 8 months ago
- ATC-Anno is an annotation tool for Air Traffic Control data that offers automatic semantic and concept annotation.☆11Updated last year
- ☆16Updated 2 years ago
- ☆40Updated last year
- Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition"☆16Updated 4 years ago
- Official implementation of "PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords" (INTERSPEECH 2023)☆48Updated last year
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.☆69Updated 2 years ago
- unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"☆15Updated last year
- Discriminative Training of VBx Diarization☆25Updated 8 months ago
- A semi-supervised sequence-to-sequence ASR☆10Updated 2 years ago
- Official implementation of the paper "Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Unce…☆23Updated 3 months ago
- Simple diarization model☆49Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆83Updated last year
- ☆11Updated last year
- The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…☆13Updated last year
- Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting"☆31Updated last month
- This is a repository of neural full-rank spatial covariance analysis with speaker activity (neural FCASA).☆32Updated 3 months ago
- ☆53Updated last year
- FNSE-SBGAN: Far-field Speech Enhancement with Schrödinger Bridge and Generative Adversarial Networks☆11Updated last month
- ☆49Updated 4 years ago
- This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024☆22Updated 6 months ago
- Deep model with built-in self-attention alignment for acoustic echo cancellation, PyTorch implementation☆38Updated last year
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Updated 4 months ago
- ☆14Updated last year
- ☆34Updated last year
- This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).☆28Updated 5 months ago
- ☆26Updated last year
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆12Updated 4 months ago