Anoncheg1 / stable-ts-con
Stable timestamps and confidence scores for OpenAI's Whisper outputs, down to word level.
☆24Dec 20, 2022Updated 3 years ago
Alternatives and similar repositories for stable-ts-con
Users who are interested in stable-ts-con are comparing it to the libraries listed below.
- ☆19Nov 4, 2022Updated 3 years ago
- Russian phonetic transcription☆11Nov 19, 2025Updated 2 months ago
- 🎵 muse: Music Separation☆11Feb 14, 2024Updated 2 years ago
- Neural model for prediction of stress position in Russian words☆12Jun 22, 2025Updated 7 months ago
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆28Sep 20, 2025Updated 4 months ago
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆13Sep 27, 2024Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline☆13Feb 14, 2024Updated 2 years ago
- ☆13Dec 7, 2022Updated 3 years ago
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"☆20Jun 7, 2025Updated 8 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition☆14Nov 15, 2025Updated 2 months ago
- ☆20Mar 7, 2025Updated 11 months ago
- ☆19Jan 8, 2025Updated last year
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit…☆22Jan 10, 2025Updated last year
- Arabic-language questions focused on Saudi culture, tested on a number of large language models (LLMs)☆17Jan 22, 2025Updated last year
- T5-based (russian) text normalization☆25Jan 25, 2024Updated 2 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- ☆24Mar 13, 2020Updated 5 years ago
- Speaker diarization service☆26Feb 2, 2026Updated last week
- A free & open tool for transcribing audio interviews with offline ASR support☆25Dec 21, 2023Updated 2 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.☆33Oct 23, 2025Updated 3 months ago
- A repository of Japanese Phoneme-Level BERT☆22Dec 16, 2023Updated 2 years ago
- A collection of all our phonemeizers for dataset construction and inference☆27Feb 21, 2025Updated 11 months ago
- Data from the 6th edition of A. A. Zaliznyak's "Grammatical Dictionary of the Russian Language" (2010) as text files☆24Sep 17, 2024Updated last year
- ☆56Dec 19, 2022Updated 3 years ago
- A Unity implementation of DeepSpeech, an open source embedded (offline, on-device) speech-to-text engine which can run in real time on …☆25Sep 22, 2022Updated 3 years ago
- ☆28Nov 15, 2023Updated 2 years ago
- Accelerate Whisper tasks such as transcription through multiprocessing☆25Oct 29, 2022Updated 3 years ago
- ⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.☆35Jan 19, 2024Updated 2 years ago
- Normalize Text in Russian☆28Nov 7, 2023Updated 2 years ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆66Aug 13, 2024Updated last year
- Robust Speech Recognition via Large-Scale Weak Supervision☆29Dec 16, 2023Updated 2 years ago
- MSP-Podcast Challenge Baseline Code☆30Jun 12, 2024Updated last year
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆30Aug 2, 2025Updated 6 months ago
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆27Oct 10, 2023Updated 2 years ago
- Baselines for IS25 Source Tracing Special Session☆33Jan 3, 2025Updated last year
- InSales e-commerce platform API bindings☆14Jul 13, 2024Updated last year
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.☆29May 14, 2025Updated 9 months ago