A Python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Vosk Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file
☆11May 5, 2024Updated last year
Alternatives and similar repositories for vosk_autosrt
Users that are interested in vosk_autosrt are comparing it to the libraries listed below
- A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using faster_whisper module which is a reimplementation of OpenAI Wh…☆28May 5, 2024Updated last year
- Translate subtitle file to another language☆11May 5, 2024Updated last year
- ANDROID APP to AUTO GENERATE SUBTITLE FILE and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any audio/vide…☆21May 5, 2024Updated last year
- Generate transcriptions and subtitles using OpenAI whisper as a base model, stable-ts/whisperx as a timestamp stabilizer using ASR models…☆19Mar 10, 2023Updated 3 years ago
- Simple patch applier written in C++ for CapCut (any version, as far as I know)☆12Dec 12, 2025Updated 2 months ago
- ANDROID APP that can RECOGNIZE ANY LIVE AUDIO/VIDEO STREAMING (using free VOSK Speech Recognition API) then TRANSLATE (using unofficial o…☆39May 5, 2024Updated last year
- ☆38Feb 5, 2026Updated last month
- ☆13Oct 9, 2025Updated 5 months ago
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)☆10Oct 30, 2024Updated last year
- Movie Web with real data film☆11Sep 24, 2022Updated 3 years ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR☆11Oct 21, 2022Updated 3 years ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github…☆14Dec 19, 2022Updated 3 years ago
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20…☆16Sep 1, 2024Updated last year
- A Python-based tool for downloading Spotify tracks and albums as MP3 files.☆10Nov 18, 2024Updated last year
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas…☆24Updated this week
- Hanime.tv stremio addon☆17Feb 10, 2026Updated last month
- Whisper finetuning☆16Apr 9, 2025Updated 11 months ago
- ☆11Aug 11, 2023Updated 2 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories☆19Apr 10, 2025Updated 11 months ago
- Russian phonetical transcription☆11Nov 19, 2025Updated 3 months ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming☆15Aug 20, 2024Updated last year
- A gradio interface for making transcribed and translated subtitles for videos☆42Feb 16, 2025Updated last year
- Example python scripts to evaluate various ASR methods☆11Dec 22, 2021Updated 4 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆10Sep 30, 2024Updated last year
- PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…☆11Jun 28, 2021Updated 4 years ago
- All-in-one Speech Transcription☆10Jan 25, 2026Updated last month
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Mar 24, 2023Updated 2 years ago
- My public domain speech index☆13Sep 19, 2019Updated 6 years ago
- [ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation☆13Aug 2, 2023Updated 2 years ago
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- ☆11Nov 5, 2021Updated 4 years ago
- Docker for building an environment for Dutch online and offline ASR.☆12Feb 2, 2021Updated 5 years ago
- Widevine L3 CDM Give away☆14Jan 12, 2023Updated 3 years ago
- AsoSoft Speech Corpus can be used for spoken language processing tasks in Central Kurdish such as speech recognition, speaker recognition…☆10Mar 8, 2022Updated 4 years ago
- Turns IMDb watchlist into a YTS torrent RSS feed☆12Oct 16, 2017Updated 8 years ago
- ☆15Jul 14, 2020Updated 5 years ago