DongKeon / webrtc-whisper-asr
WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.
☆13Sep 27, 2024Updated last year
Alternatives and similar repositories for webrtc-whisper-asr
Users who are interested in webrtc-whisper-asr are comparing it to the libraries listed below.
- Russian phonetical transcription☆11Nov 19, 2025Updated 2 months ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Mar 24, 2023Updated 2 years ago
- 🎵 muse: Music Separation☆11Feb 14, 2024Updated 2 years ago
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆27Sep 20, 2025Updated 4 months ago
- ☆14Aug 16, 2023Updated 2 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- Neural model for prediction of stress position in Russian words☆12Jun 22, 2025Updated 7 months ago
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech☆11Jun 30, 2023Updated 2 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 9 months ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆14Jun 28, 2024Updated last year
- Indic-Conformer models for ASR☆20Jul 19, 2024Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline☆13Feb 14, 2024Updated 2 years ago
- ☆13Dec 7, 2022Updated 3 years ago
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"☆21Jun 7, 2025Updated 8 months ago
- ☆14Aug 19, 2024Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi☆18Jun 12, 2022Updated 3 years ago
- ☆20Mar 7, 2025Updated 11 months ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription☆23Feb 2, 2026Updated 2 weeks ago
- ☆19Jan 8, 2025Updated last year
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit…☆22Jan 10, 2025Updated last year
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis☆44Jul 24, 2023Updated 2 years ago
- Python implementation of a few speech intelligibility prediction algorithms☆15May 29, 2024Updated last year
- Hebrew grapheme to phoneme (G2P)☆88Feb 1, 2026Updated 2 weeks ago
- ☆21Mar 4, 2024Updated last year
- T5-based (russian) text normalization☆25Jan 25, 2024Updated 2 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- RVC Onnx Infer- Upgraded and simplified-ish☆25May 9, 2024Updated last year
- ☆24Mar 13, 2020Updated 5 years ago
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning …☆27Aug 1, 2023Updated 2 years ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…☆24Sep 27, 2022Updated 3 years ago
- Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transfor…☆24Feb 17, 2023Updated 3 years ago
- A collection of all our phonemeizers for dataset construction and inference☆27Feb 21, 2025Updated 11 months ago
- Data from the 6th edition of A. A. Zaliznyak's "Grammatical Dictionary of the Russian Language" (2010) as text files☆24Sep 17, 2024Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.☆33Oct 23, 2025Updated 3 months ago
- This repository contains text-to-speech (TTS) models and utilities designed to produce synthetic training datasets for other speech-related …☆28Mar 12, 2023Updated 2 years ago
- SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech☆27May 25, 2023Updated 2 years ago
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level.☆24Dec 20, 2022Updated 3 years ago