groundcat / Google-AI-video-transcribe-subtitle-generator
Transcribes video using GCP speech-to-text and generates .SRT subtitles
☆16Updated last year
Alternatives and similar repositories for Google-AI-video-transcribe-subtitle-generator:
Users who are interested in Google-AI-video-transcribe-subtitle-generator are comparing it to the libraries listed below
- Generate transcriptions and subtitles using OpenAI whisper as a base model, stable-ts/whisperx as a timestamp stabilizer using ASR models…☆18Updated last year
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions☆51Updated 3 years ago
- An end to end ASR Transformer model training repo☆13Updated 3 years ago
- A project about learning how to synchronize subtitles in movies using machine learning.☆9Updated 2 years ago
- A gradio interface for making transcribed and translated subtitles for videos☆34Updated this week
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker …☆20Updated 5 months ago
- ToneNet: A CNN Model of Tone Classification of Mandarin Chinese☆17Updated 5 years ago
- mirror of VoxCeleb dataset - a large-scale speaker identification dataset☆69Updated 5 years ago
- Generate subtitle files with timelines in an automatic way.☆62Updated 2 years ago
- Paraformer (Chinese ASR) online ONNX runtime for Python☆40Updated 10 months ago
- (Deprecated) WaveNet vocoder☆21Updated 4 years ago
- one script for xls-r/xlsr/whisper fine-tuning☆40Updated last year
- Deep learning using CNN for Mandarin Chinese tone classification☆33Updated 5 years ago
- AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data☆70Updated 3 years ago
- A packaged convolutional voice activity detector for noisy environments.☆14Updated 5 years ago
- Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK)☆14Updated 2 years ago
- Online (real-time) decoder to be used with DeepSpeech2 model☆24Updated 4 years ago
- Transferability of cross-lingual and cross-age speech emotion recognition☆18Updated last year
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level…☆11Updated 2 months ago
- A streamlit application that lets you explore the effect of different audio augmentation techniques☆27Updated 2 years ago
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni…☆24Updated 3 years ago
- ☆33Updated 3 years ago
- ☆29Updated 5 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 7 months ago
- wav2vec2 asr with transformers☆16Updated 3 years ago
- Python interface to the WebRTC Noise Suppression☆18Updated 3 years ago
- Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory☆16Updated 5 years ago
- ☆15Updated 5 years ago
- Curriculum Vitae of Quan Wang☆14Updated last month
- Mispronunciation detection code for jingju singing voice☆20Updated 6 years ago