luigisaetta / whisper-app
This repository contains all the work I have done (and am still doing) to develop a web app for speech-to-text, based on OpenAI Whisper.
☆9Updated last year
Related projects
Alternatives and complementary repositories for whisper-app
- ☆54Updated this week
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Updated 9 months ago
- Tunable pipelines☆30Updated last month
- Whisper fine-tuning event script to use multiple hf datasets☆32Updated last year
- Finetune VITS and MMS using HuggingFace's tools☆122Updated 7 months ago
- ☆38Updated 2 years ago
- Repository containing code to fine-tune the Whisper ASR model☆23Updated last year
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode☆110Updated 2 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆134Updated 10 months ago
- Create an LJSpeech structured voice dataset on wave input☆21Updated last month
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆157Updated 8 months ago
- ☆56Updated last year
- ☆257Updated 5 months ago
- Collection of Open Source Speech Data☆146Updated 2 weeks ago
- Various speech datasets made available to the public☆99Updated last month
- Use quantized versions of Whisper to speed up inference☆11Updated last month
- Zero-shot Audio Classification using Whisper☆74Updated last year
- Final training script from the HuggingFace Whisper fine-tuning event, for getting the best results on a fine-tuned model.☆12Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆71Updated last year
- ☆41Updated last year
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.☆34Updated last year
- ☆66Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆83Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆45Updated 2 weeks ago
- ☆40Updated last year
- Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.☆65Updated last week
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆26Updated last year
- Finetuning Whisper ASR model for Belarusian language☆14Updated last year
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.☆259Updated last year