A Python command-line utility that auto-generates a subtitle file (using the free Vosk Speech Recognition API) and a translated subtitle file (using the unofficial online Google Translate API) for any video or audio file
☆11 · May 5, 2024 · Updated last year
Alternatives and similar repositories for vosk_autosrt
Users that are interested in vosk_autosrt are comparing it to the libraries listed below
- A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using faster_whisper module which is a reimplementation of OpenAI Wh… ☆28 · May 5, 2024 · Updated last year
- Translate subtitle file to another language ☆11 · May 5, 2024 · Updated last year
- ANDROID APP to AUTO GENERATE SUBTITLE FILE and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any audio/vide… ☆21 · May 5, 2024 · Updated last year
- Generate transcriptions and subtitles using OpenAI whisper as a base model, stable-ts/whisperx as a timestamp stabilizer using ASR models… ☆19 · Mar 10, 2023 · Updated 2 years ago
- Simple patch applier written in c++ for Capcut (Any version as far as ik) ☆12 · Dec 12, 2025 · Updated 2 months ago
- ANDROID APP that can RECOGNIZE ANY LIVE AUDIO/VIDEO STREAMING (using free VOSK Speech Recognition API) then TRANSLATE (using unofficial o… ☆39 · May 5, 2024 · Updated last year
- ☆38 · Feb 5, 2026 · Updated last month
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 · Apr 10, 2025 · Updated 10 months ago
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas… ☆23 · Updated this week
- ☆11 · Aug 11, 2023 · Updated 2 years ago
- Whisper finetuning ☆16 · Apr 9, 2025 · Updated 11 months ago
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20… ☆16 · Sep 1, 2024 · Updated last year
- Russian phonetical transcription ☆11 · Nov 19, 2025 · Updated 3 months ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆14 · Dec 19, 2022 · Updated 3 years ago
- Hanime.tv stremio addon ☆17 · Feb 10, 2026 · Updated 3 weeks ago
- Movie Web with real data film ☆11 · Sep 24, 2022 · Updated 3 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming ☆15 · Aug 20, 2024 · Updated last year
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" … ☆10 · Nov 6, 2024 · Updated last year
- A Python-based tool for downloading Spotify tracks and albums as MP3 files. ☆10 · Nov 18, 2024 · Updated last year
- ☆13 · Oct 9, 2025 · Updated 5 months ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23) ☆10 · Oct 30, 2024 · Updated last year
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- A gradio interface for making transcribed and translated subtitles for videos ☆42 · Feb 16, 2025 · Updated last year
- ☆10 · May 25, 2021 · Updated 4 years ago
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- Deepspeech/Coqui AI speech to text systems in Esperanto. - Parolrekoniloj en Esperanto uzante Deepspeech/Coqui Ai. ☆10 · Jan 11, 2022 · Updated 4 years ago
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems ☆13 · Jan 16, 2025 · Updated last year
- Using YouTube to prepare a speech recognition dataset for any language ☆10 · Mar 30, 2021 · Updated 4 years ago
- manage playlists for mpv in your Linux terminal ☆13 · Dec 14, 2025 · Updated 2 months ago
- Target speaker automatic speech recognition (TS-ASR) ☆12 · Oct 14, 2023 · Updated 2 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE ☆15 · Nov 30, 2022 · Updated 3 years ago
- [Tiny KWS] SparkNet: Sparse Binarization for Fast Keyword Spotting ☆17 · Aug 26, 2025 · Updated 6 months ago
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset ☆12 · Sep 29, 2025 · Updated 5 months ago
- One command to start a streaming ASR server. ☆12 · Oct 2, 2024 · Updated last year
- ☆26 · Nov 3, 2025 · Updated 4 months ago
- IPA Phonetic dataset lexicon ☆18 · Updated this week
- ☆12 · Apr 26, 2025 · Updated 10 months ago
- A C++ library for parsing and manipulating JSGF grammar files. ☆14 · Feb 13, 2024 · Updated 2 years ago
- PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and… ☆11 · Jun 28, 2021 · Updated 4 years ago