A minimalistic, Streamlit-based automatic speech recognition web app powered by OpenAI's Whisper "state-of-the-art" models
☆67Sep 26, 2022Updated 3 years ago
Alternatives and similar repositories for OpenAI_Whisper_ASR
Users that are interested in OpenAI_Whisper_ASR are comparing it to the libraries listed below
- Enable RNNLM lattice rescoring with Pytorch [kaldi]☆12Jun 5, 2020Updated 5 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- ☆26Sep 22, 2022Updated 3 years ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 3 months ago
- Synthesized singing voice demos of WeSinger 2 paper.☆26Feb 20, 2023Updated 3 years ago
- ☆46Apr 16, 2023Updated 2 years ago
- ☆15Apr 2, 2025Updated 11 months ago
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Nov 25, 2022Updated 3 years ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network☆45Dec 1, 2021Updated 4 years ago
- Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…☆17Apr 27, 2023Updated 2 years ago
- Production first, nn-based on-device signal processing toolkit.☆65May 30, 2023Updated 2 years ago
- ☆68Jul 16, 2023Updated 2 years ago
- This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which is submitted to TPAMI.☆35Oct 11, 2025Updated 4 months ago
- ☆33Nov 29, 2022Updated 3 years ago
- Separately maintained Chinese TTS☆34Oct 28, 2022Updated 3 years ago
- In this repository, I try to combine k2 with speechbrain for accurate and fast decoding.☆16Jun 17, 2022Updated 3 years ago
- Speech Resynthesis and Language Modeling☆27Jun 11, 2025Updated 8 months ago
- ☆16Jun 13, 2022Updated 3 years ago
- Implementation of the AlignTTS☆77Jul 6, 2023Updated 2 years ago
- ☆17Aug 27, 2025Updated 6 months ago
- ☆23Oct 17, 2024Updated last year
- ☆24Mar 13, 2020Updated 5 years ago
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Sep 16, 2022Updated 3 years ago
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis☆44Jul 24, 2023Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- Official implementation of the source-filter HiFiGAN vocoder☆268Jul 29, 2023Updated 2 years ago
- A free & open tool for transcribing audio interviews with offline ASR support☆25Dec 21, 2023Updated 2 years ago
- A repository for benchmarking neural vocoders by their quality and speed.☆211May 30, 2025Updated 9 months ago
- ☆46Nov 2, 2023Updated 2 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code.☆68Jan 5, 2026Updated 2 months ago
- ☆33Nov 27, 2021Updated 4 years ago
- ☆26Jun 5, 2024Updated last year
- Transcribing Speech with Multinomial Diffusion, training code and models.☆80Sep 27, 2023Updated 2 years ago
- ☆117Feb 26, 2026Updated last week
- The implementation of TaylorBeamformer, which is in submission to Interspeech2022☆48Jun 10, 2022Updated 3 years ago
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago