FamousDirector / FastWhisper
This is an optimized implementation of OpenAI's Whisper for multilingual transcription.
☆38Updated 2 years ago
Alternatives and similar repositories for FastWhisper:
Users that are interested in FastWhisper are comparing it to the libraries listed below
- ☆38Updated 3 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆90Updated 3 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆107Updated last year
- Putting flows on top of neural transducers for better TTS☆63Updated this week
- Use quantized versions of Whisper to speed up inference☆12Updated 3 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆146Updated 2 weeks ago
- Whisper combined with Silero VAD, for improved long-form transcriptions☆45Updated 2 years ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…☆20Updated 5 months ago
- A collection of utilities for handling IPA phones.☆26Updated last year
- Whisper fine-tuning event script to use multiple hf datasets☆32Updated 2 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Updated 11 months ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- Collection of scripts from mHuBERT-147.☆24Updated 2 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 7 months ago
- Unofficial implementation of wavenext vocoder☆40Updated 5 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆135Updated last year
- Package for inference for punctuation, true-casing, and sentence boundary detection☆24Updated 7 months ago
- Finetuning VITS Efficiently☆32Updated last year
- Official implementation of the TTS model Lina-Speech☆150Updated 3 weeks ago
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text☆36Updated 4 years ago
- ☆34Updated 4 months ago
- ☆70Updated last year
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions☆52Updated 3 years ago
- This is the M-AILABS Speech Dataset☆38Updated 2 months ago
- Simple voice activity detection (VAD) algorithm in Python☆12Updated last year
- 56 language, 1 model Multilingual ASR☆24Updated 3 years ago
- SubER - Subtitle Edit Rate☆22Updated 5 months ago
- ☆65Updated 2 months ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection☆56Updated 2 weeks ago
- Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to…☆31Updated 4 years ago