miguelvalente / whisperer
Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.
★137 · Updated 2 years ago
Alternatives and similar repositories for whisperer
Users that are interested in whisperer are comparing it to the libraries listed below
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ★150 · Updated last year
- Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★257 · Updated 2 years ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ★68 · Updated 3 weeks ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ★97 · Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation ★132 · Updated 2 years ago
- ★359 · Updated last year
- Performant and accurate speech recognition built on Pytorch ★253 · Updated 3 years ago
- Zero-shot Audio Classification using Whisper ★79 · Updated 2 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ★149 · Updated last year
- DLAS - A configuration-driven trainer for generative models ★142 · Updated 2 years ago
- ★262 · Updated last year
- Speaker Diarization with Transformers ★69 · Updated 3 months ago
- Your one-stop solution for voice dataset creation ★124 · Updated last year
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ★252 · Updated last year
- Faster Tortoise inference than Tortoise Fast Fork ★128 · Updated last year
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ★163 · Updated last year
- A testing repo to share code and thoughts on diarisation ★56 · Updated last year
- ★274 · Updated last year
- [WIP] VoiceSmith makes training text to speech models easy. ★226 · Updated 2 years ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ★404 · Updated last year
- Whisper combined with Silero VAD, for improved long-form transcriptions ★53 · Updated 2 years ago
- TorToiSe fine-tuning with DLAS ★225 · Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ★103 · Updated 11 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ★118 · Updated 2 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ★82 · Updated 2 years ago
- Official Implementation of StyleTTS ★452 · Updated 8 months ago
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch ★274 · Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ★107 · Updated 2 weeks ago
- ★378 · Updated last year
- Barkify: an unofficial training implementation of Bark TTS by suno-ai ★127 · Updated 2 years ago