Anoncheg1/stable-ts-con

Stable timestamps and confidence scores for OpenAI's Whisper output, down to the word level.

☆25

Alternatives and similar repositories for stable-ts-con

Users that are interested in stable-ts-con are comparing it to the libraries listed below.

qlemaire22 / real-time-audio-analysis
Real-time audio analysis with Keras for Speech and Music Detection.
☆21 · Nov 15, 2018 · Updated 7 years ago
bjnortier / whisper-tflite-ios
☆19 · Nov 4, 2022 · Updated 3 years ago
nlml / bpm2
☆11 · Jan 2, 2020 · Updated 6 years ago
DongKeon / webrtc-whisper-asr
WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.
☆13 · Sep 27, 2024 · Updated last year
deepvk / muse
🎵 muse: Music Separation
☆11 · Feb 14, 2024 · Updated 2 years ago
ynop / py-ctc-decode
CTC Decoder implementation with python only. Also supports language model decoding using KenLM.
☆37 · May 3, 2024 · Updated 2 years ago
suralmasha / RuTranscript
Russian phonetical transcription
☆11 · May 20, 2026 · Updated 2 months ago
rendchevi / daisy-tts
🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
☆14 · Nov 15, 2025 · Updated 8 months ago
Koziev / StressModel
Neural model for prediction of stress position in Russian words
☆13 · Jun 22, 2025 · Updated last year
hslh / pie-detection
Automatic Detection of Potentially Idiomatic Expressions
☆12 · Feb 19, 2021 · Updated 5 years ago
anqorithm / Saudi-CERT-API
This repository has a tool and an API for Saudi CERT alerts. Its goal is to help improve the level of cybersecurity awareness in Saudi Ar…
☆13 · Nov 16, 2023 · Updated 2 years ago
kyamauchi1023 / PL-BERT-ja
A repository of Japanese Phoneme-Level BERT
☆24 · Dec 16, 2023 · Updated 2 years ago
ronggong / MIREX-2018-Automatic-Lyrics-to-Audio-Alignment
Util code, issues, discussions
☆29 · Aug 31, 2018 · Updated 7 years ago
AIRI-Institute / AI4TALK
☆13 · Dec 7, 2022 · Updated 3 years ago
MZehren / M-DJCUE
A manually annotated dataset of cue points
☆15 · Nov 5, 2019 · Updated 6 years ago
ml-for-speech / speechtoolkit
[Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit…
☆21 · Jan 10, 2025 · Updated last year
pashanitw / W2V2-BERT-ASR-Training
☆15 · Mar 25, 2024 · Updated 2 years ago
tsob / cnn-music-structure
Music structure segmentation with convnets
☆13 · Mar 11, 2016 · Updated 10 years ago
CoEDL / vad-sli-asr
A pipeline to isolate and transcribe one language in mixed-language speech
☆20 · Oct 25, 2022 · Updated 3 years ago
mtkresearch / clairaudience
Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)
☆26 · Oct 10, 2023 · Updated 2 years ago
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 · Aug 5, 2024 · Updated last year
c4dm / nnls-chroma
A Vamp plugin library for harmony and chord extraction.
☆15 · Apr 24, 2020 · Updated 6 years ago
Aratako / CALM-DACVAE
An attempt to reproduce CALM (Continuous Audio Language Models) using DACVAE as the audio VAE.
☆18 · Feb 20, 2026 · Updated 5 months ago
emirdemirel / ASA_ICASSP2021
A duration-invariant audio-to-lyrics alignment pipeline with low memory footprint which segments long music recordings via a recursive bi…
☆15 · Oct 13, 2022 · Updated 3 years ago
leohuang2013 / pyannote-audio_overlapped-speech-detection_cpp
C++ version of pyannote audio overlapped speech detection pipeline
☆13 · Feb 14, 2024 · Updated 2 years ago
thu-spmi / CTC-TTS
Code for CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment, Interspeech 2026.
☆20 · Jun 9, 2026 · Updated last month
aispeech-lab / w2v-cif-bert
☆37 · Jun 28, 2021 · Updated 5 years ago
avishaiElmakies / unsupervised_speech_segmentation_using_slm
☆20 · Jan 8, 2025 · Updated last year
emonosuke / emoASR
End-to-end MOdeling of ASR (Automatic Speech Recognition)
☆33 · Feb 16, 2023 · Updated 3 years ago
Warblefly / TrackBoundaries
Automatically detects many audio parameters, and notates a liquidaudio playlist accordingly
☆19 · Apr 18, 2026 · Updated 3 months ago
speechpro / mixup
☆24 · Mar 13, 2020 · Updated 6 years ago
lqtrung1998 / mwp_cot_design
☆14 · Oct 11, 2023 · Updated 2 years ago
msplabresearch / MSP-Podcast_Challenge
MSP-Podcast Challenge Baseline Code
☆31 · Jun 12, 2024 · Updated 2 years ago
AmirmohammadRostami / ASV-anti-spoofing-with-EABN
☆15 · Feb 25, 2023 · Updated 3 years ago
fakerybakery / utmos
A toolkit to calculate speech audio quality. Not affiliated with the original authors.
☆74 · Aug 13, 2024 · Updated last year
ProjectEGU / whisper-for-low-vram
Robust Speech Recognition via Large-Scale Weak Supervision
☆29 · Dec 16, 2023 · Updated 2 years ago
interscript / rababa
Rababa, the diacritization library for Arabic and Hebrew (Abjad scripts in general)
☆13 · May 1, 2025 · Updated last year
voidful / SpeechMix
Explore different ways to mix speech models (wav2vec2, hubert) and NLP models (BART, T5, GPT) together
☆46 · Jul 3, 2025 · Updated last year
collabora / whisper-finetuning
Whisper finetuning
☆17 · Apr 9, 2025 · Updated last year