Synchronize Whisper's timestamps over an existing accurate transcription
☆163May 28, 2024Updated last year
Alternatives and similar repositories for WhisperTimeSync
Users who are interested in WhisperTimeSync often compare it to the libraries listed below
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆348Nov 12, 2024Updated last year
- SubER - Subtitle Edit Rate☆24Feb 19, 2026Updated 2 weeks ago
- ☆38Dec 26, 2022Updated 3 years ago
- ☆11Nov 7, 2024Updated last year
- Generate transcriptions and subtitles using OpenAI whisper as a base model, stable-ts/whisperx as a timestamp stabilizer using ASR models…☆19Mar 10, 2023Updated 2 years ago
- Timething is a library for aligning text transcripts with their audio recordings.☆130Dec 3, 2024Updated last year
- Transcription, forced alignment, and audio indexing with OpenAI's Whisper☆2,169Oct 29, 2025Updated 4 months ago
- Streaming Vocos☆30Jun 10, 2025Updated 8 months ago
- Diffusion Model for Voice Conversion☆69Mar 14, 2024Updated last year
- Russian accentuator and IPA transcriber☆16Sep 10, 2024Updated last year
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence☆2,769Sep 9, 2025Updated 5 months ago
- ez audio transcription tool with flexible processing and post-processing options☆163Feb 1, 2024Updated 2 years ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s…☆28Mar 14, 2023Updated 2 years ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …☆413Feb 21, 2024Updated 2 years ago
- ☆32Nov 24, 2024Updated last year
- Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…☆55Nov 4, 2022Updated 3 years ago
- ☆18Nov 8, 2022Updated 3 years ago
- Overlapped Speech detection in Multi-party Conversations☆22Feb 20, 2018Updated 8 years ago
- A testing repo to share code and thoughts on diarisation☆57Mar 26, 2024Updated last year
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming☆15Aug 20, 2024Updated last year
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols☆17Nov 19, 2025Updated 3 months ago
- Generate granular word-level captions in SRT format☆57Sep 26, 2022Updated 3 years ago
- Lightweight Speech Representation Learning for One-Shot Voice Conversion☆24Dec 12, 2024Updated last year
- ☆54Jul 16, 2025Updated 7 months ago
- RWKV-SpeechChat is a real-time dialogue script based on a frozen 3B RWKV model with trained adapters and initial states. Various trained …☆28Jan 1, 2025Updated last year
- This is a winter of code project aimed at speech enhancement of text to speech models.☆24Feb 6, 2022Updated 4 years ago
- PyTorch implementation of "Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition"☆11Dec 15, 2022Updated 3 years ago
- ☆33Nov 18, 2025Updated 3 months ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 3 months ago
- A lightweight muji-moe chatbot created by Reecho.ai.☆13Oct 1, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Mar 24, 2023Updated 2 years ago
- Unsupervised speech activity detection system.☆11Jul 2, 2018Updated 7 years ago
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.☆12Mar 15, 2025Updated 11 months ago
- Use a video and cut out portions of it without re-mounting the video in between.☆16Sep 23, 2024Updated last year
- Generative Adversarial Networks for different impaired speech conversions☆39Jul 6, 2023Updated 2 years ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services☆58Apr 17, 2024Updated last year
- PyTorch implementation of silero-vad☆36Nov 23, 2024Updated last year
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge☆15Mar 26, 2022Updated 3 years ago