groundcat / Google-AI-video-transcribe-subtitle-generator
Transcribes video using GCP speech-to-text and generates .SRT subtitles
☆16 · Updated last year
Alternatives and similar repositories for Google-AI-video-transcribe-subtitle-generator:
Users who are interested in Google-AI-video-transcribe-subtitle-generator are comparing it to the repositories listed below.
- One script for xls-r/xlsr/whisper fine-tuning ☆40 · Updated last year
- Paraformer (Chinese ASR) online ONNX runtime for Python ☆40 · Updated 10 months ago
- An end-to-end ASR Transformer model training repo ☆13 · Updated 3 years ago
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker … ☆20 · Updated 5 months ago
- Official Code for ParrotTTS ☆49 · Updated 4 months ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆39 · Updated last year
- Generate transcriptions and subtitles using OpenAI Whisper as a base model, stable-ts/whisperx as a timestamp stabilizer using ASR models… ☆18 · Updated last year
- faster-whisper livestream translation, OBS noise reduction, dual language subtitles ☆77 · Updated last year
- The case study and multilingual performance of ICASSP submission ☆20 · Updated 2 years ago
- A tiny project for ASR model training and deployment ☆27 · Updated 2 years ago
- Whisper combined with Silero VAD, for improved long-form transcriptions ☆46 · Updated 2 years ago
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions ☆51 · Updated 3 years ago
- A project about learning how to synchronize subtitles in movies using machine learning. ☆9 · Updated 2 years ago
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level… ☆11 · Updated 2 months ago
- ☆33 · Updated 3 years ago
- ☆12 · Updated 2 years ago
- Supervoice Speaker Separation Network ☆12 · Updated 8 months ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR. ☆26 · Updated last year
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆29 · Updated 9 months ago
- Curriculum Vitae of Quan Wang ☆14 · Updated last month
- Speaker change detection using SincNet and an LSTM/Transformer ☆46 · Updated 7 months ago
- Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK) ☆14 · Updated 2 years ago
- Online (real-time) decoder to be used with DeepSpeech2 model ☆24 · Updated 4 years ago
- ☆53 · Updated 7 months ago
- Putting flows on top of neural transducers for better TTS ☆62 · Updated 2 weeks ago
- A gradio interface for making transcribed and translated subtitles for videos ☆34 · Updated this week
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 · Updated last year
- ☆11 · Updated 3 years ago
- Python implementation of CTC beam search decoder + agnostic LM scorer ☆19 · Updated 4 years ago
- ☆25 · Updated 2 years ago