FamousDirector / FastWhisper
This is an optimized implementation of OpenAI's Whisper for multilingual transcription.
☆39 Updated 3 years ago
Alternatives and similar repositories for FastWhisper
Users who are interested in FastWhisper are comparing it to the libraries listed below.
- ☆40 Updated 3 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆103 Updated last year
- Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight. ☆78 Updated last month
- Putting flows on top of neural transducers for better TTS ☆64 Updated last week
- Whisper fine-tuning event script to use multiple hf datasets ☆32 Updated 2 years ago
- Various speech datasets made available to the public ☆129 Updated last year
- ☆156 Updated this week
- Fine-Tune Whisper with Transformers and PEFT ☆58 Updated 2 years ago
- ☆37 Updated 3 weeks ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆181 Updated last week
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni… ☆23 Updated 4 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆151 Updated last year
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆76 Updated 4 years ago
- Convert English text from written expressions into spoken forms ☆27 Updated 3 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆137 Updated 2 years ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆140 Updated 6 months ago
- A Python package for deep multilingual punctuation prediction. ☆151 Updated last year
- Open TTS models, built for streaming on the edge ☆44 Updated 9 months ago
- Neural HMMs are all you need (for high-quality attention-free TTS) ☆163 Updated last week
- Python bindings for symphonia/opus - read various audio formats from Python and write opus files ☆70 Updated 4 months ago
- Official implementation of the TTS model Lina-Speech ☆175 Updated 11 months ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆56 Updated 6 months ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆80 Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 Updated last week
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆178 Updated last year
- ☆56 Updated 2 years ago
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆34 Updated 5 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆153 Updated last year
- A high-quality, varied ~30hr voice dataset suitable for training a TTS model ☆63 Updated 2 years ago