A minimalistic, Streamlit-based automatic speech recognition web app powered by OpenAI's Whisper "State of the Art" models
☆67Sep 26, 2022Updated 3 years ago
Alternatives and similar repositories for OpenAI_Whisper_ASR
Users that are interested in OpenAI_Whisper_ASR are comparing it to the libraries listed below.
- Enable RNNLM lattice rescoring with Pytorch [kaldi]☆12Jun 5, 2020Updated 5 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- ☆26Sep 22, 2022Updated 3 years ago
- Synthesized singing voice demos of WeSinger 2 paper.☆26Feb 20, 2023Updated 3 years ago
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Nov 25, 2022Updated 3 years ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 4 months ago
- A free & open tool for transcribing audio interviews with offline ASR support☆25Dec 21, 2023Updated 2 years ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network☆45Dec 1, 2021Updated 4 years ago
- ☆46Apr 16, 2023Updated 2 years ago
- Production first, nn-based on-device signal processing toolkit.☆65May 30, 2023Updated 2 years ago
- Separately maintained Chinese TTS☆34Oct 28, 2022Updated 3 years ago
- ☆33Nov 29, 2022Updated 3 years ago
- ☆68Jul 16, 2023Updated 2 years ago
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Sep 16, 2022Updated 3 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and quickly.☆16Jun 17, 2022Updated 3 years ago
- ☆15Apr 2, 2025Updated 11 months ago
- Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…☆17Apr 27, 2023Updated 2 years ago
- ☆23Oct 17, 2024Updated last year
- Official implementation of the source-filter HiFiGAN vocoder☆270Jul 29, 2023Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis☆44Jul 24, 2023Updated 2 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- Ultrafast GAN based Vocoder for Text to Speech☆50Jul 16, 2022Updated 3 years ago
- ☆16Jun 13, 2022Updated 3 years ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code.☆68Jan 5, 2026Updated 2 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models.☆80Sep 27, 2023Updated 2 years ago
- Java Bindings for the C++ library DeepSpeech☆10Jun 4, 2020Updated 5 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- A repository for benchmarking neural vocoders by their quality and speed.☆211May 30, 2025Updated 10 months ago
- ☆16Feb 19, 2026Updated last month
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- multilingual speech aligner☆76Nov 19, 2023Updated 2 years ago
- Implementation of the AlignTTS☆77Jul 6, 2023Updated 2 years ago
- ☆17Aug 27, 2025Updated 7 months ago
- Room impulse response simulation for various array architectures using Monte-Carlo simulation and quaternions (Python)☆17Feb 25, 2026Updated last month
- TransferTTS (Zero-Shot learning of VITS)☆101Sep 23, 2022Updated 3 years ago
- ☆46Nov 2, 2023Updated 2 years ago
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…☆33Jun 14, 2024Updated last year
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago