parambharat / whisper-finetuning
Repository containing code to fine-tune the Whisper ASR model
☆23Updated 2 years ago
Alternatives and similar repositories for whisper-finetuning:
Users who are interested in whisper-finetuning are comparing it to the repositories listed below
- ☆63Updated last month
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆139Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆135Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆77Updated last year
- ☆266Updated 7 months ago
- ☆38Updated 3 years ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆53Updated last month
- Zero-shot Audio Classification using Whisper☆77Updated 2 years ago
- A TTS model that makes a speaker speak new languages☆75Updated 7 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 6 months ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.☆34Updated last year
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆147Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆27Updated last year
- Whisper fine-tuning event script to use multiple hf datasets☆32Updated 2 years ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.☆15Updated 2 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆102Updated last month
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆144Updated 8 months ago
- ☆62Updated 5 months ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆90Updated 3 months ago
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction☆244Updated 8 months ago
- Triton backend for https://github.com/OpenNMT/CTranslate2☆34Updated last year
- Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM☆36Updated last year
- Speaker Diarization with Transformers☆61Updated 7 months ago
- ☆84Updated 9 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS.☆65Updated 2 months ago
- Updates ASR papers every day☆105Updated this week
- An unofficial PyTorch implementation of VALL-E☆87Updated this week
- VoiceBox neural network implementation☆100Updated 5 months ago
- ☆348Updated 10 months ago