gongouveia / Whisper-Synthetic-ASR-Dataset-Generator
This UI serves as a synthetic ASR dataset generator powered by (and built for) OpenAI Whisper, enabling users to capture audio, transcribe it on the fly, and manage the generated dataset. Fine-tune Whisper on enhanced and custom datasets.
☆28 Updated 2 months ago
Alternatives and similar repositories for Whisper-Synthetic-ASR-Dataset-Generator:
Users who are interested in Whisper-Synthetic-ASR-Dataset-Generator are comparing it to the repositories listed below
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆92 Updated 9 months ago
- On-device speaker diarization powered by deep learning ☆38 Updated last week
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated 11 months ago
- Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger. ☆35 Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆80 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆46 Updated 7 months ago
- Create training data for training a voice cloner for bark text to speech. ☆43 Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline ☆11 Updated last year
- ☆29 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆107 Updated 2 years ago
- Use quantized versions of Whisper to speed up inference ☆12 Updated 4 months ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition. ☆38 Updated 3 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆67 Updated last year
- [Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation ☆22 Updated last year
- A testing repo to share code and thoughts on diarisation