prateekralhan / OpenAI_Whisper_ASR
A minimalistic Streamlit-based web app for automatic speech recognition, powered by OpenAI's state-of-the-art Whisper models
☆66 Updated 2 years ago
Alternatives and similar repositories for OpenAI_Whisper_ASR:
Users who are interested in OpenAI_Whisper_ASR are comparing it to the libraries listed below.
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the… ☆47 Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆50 Updated 10 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆136 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated 3 weeks ago
- ☆56 Updated 2 years ago
- Official Code for ParrotTTS ☆50 Updated 6 months ago
- ☆20 Updated 2 years ago
- Zero-shot Audio Classification using Whisper ☆80 Updated 2 years ago
- A simple voice conversion tool ☆17 Updated 3 years ago
- Simple PyTorch Denoisers for Waveform Audio ☆35 Updated last week
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated last year
- Finetuning VITS Efficiently ☆32 Updated last year
- Uses machine learning to denoise audio containing speech ☆33 Updated 10 months ago
- A curated list of awesome voice activity detection ☆50 Updated 5 months ago
- Putting flows on top of neural transducers for better TTS ☆62 Updated last month
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- Adaptive Vocoder for Custom Voice ☆59 Updated 2 years ago
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆31 Updated 2 years ago
- VITS-based zero-shot TTS system varying with diverse style/speaker conditioning methods. ☆36 Updated 2 years ago
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆33 Updated last year
- An unofficial PyTorch implementation of VALL-E ☆87 Updated 2 weeks ago
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis" ☆28 Updated last year
- ☆23 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆112 Updated 2 years ago
- Toolbox for easy and qualitative one-shot voice conversion ☆45 Updated 3 years ago
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation ☆132 Updated last year
- 4G GPU & 10 minutes for training ☆12 Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated 11 months ago
- A bidirectional recurrent neural network model with an attention mechanism for restoring missing punctuation in unsegmented text ☆36 Updated 4 years ago