Whisper combined with Silero VAD, for improved long-form transcriptions
☆54Dec 11, 2022Updated 3 years ago
Alternatives and similar repositories for WhisperWithVAD
Users that are interested in WhisperWithVAD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Robust Speech Recognition via Large-Scale Weak Supervision☆19Dec 1, 2022Updated 3 years ago
- zero shot NER fine tuning☆14Mar 17, 2025Updated last year
- Language independent SSL-based Speaker Anonymization system☆20May 28, 2024Updated 2 years ago
- Examples of demo deployment using Gradio. Image Classification, Live Webcam Segmentation, APIs , Tunneling etc.☆17Oct 17, 2022Updated 3 years ago
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Joint speech-language model - respond directly to audio!☆30May 13, 2024Updated 2 years ago
- generate granular word-level captions in srt format☆57Sep 26, 2022Updated 3 years ago
- Simple LPC vocoder in Python☆13Jan 7, 2022Updated 4 years ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆350Nov 12, 2024Updated last year
- Open Source Text-to-Speech GUI Tool running on TalkNet☆11Dec 24, 2022Updated 3 years ago
- ☆14Feb 9, 2023Updated 3 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal.☆14Sep 19, 2022Updated 3 years ago
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)☆25Jun 5, 2025Updated 11 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision☆13Oct 28, 2023Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Datasets for turn-taking research☆19Dec 21, 2023Updated 2 years ago
- AviSynth CUDA Filters☆35Mar 24, 2019Updated 7 years ago
- ☆11May 23, 2023Updated 3 years ago
- Self-Contrastive Learning: Single-viewed Supervised Contrastive Framework using Sub-network (AAAI 2023)☆21Oct 28, 2023Updated 2 years ago
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Optimizing speaker verification and spoofing countermeasure systems together with REINFORCE☆13Mar 31, 2021Updated 5 years ago
- NVIDIA's TalkNET - Train and Synthesize on colab☆15Dec 6, 2025Updated 5 months ago
- ☆49Apr 28, 2023Updated 3 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)☆125Jun 16, 2022Updated 3 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing" published in Interspeech 2023.☆31Nov 7, 2023Updated 2 years ago
- Hpyformer base FunASR☆31Nov 5, 2024Updated last year
- CosyVoice语音合成简易API☆14Nov 1, 2024Updated last year
- funasr语音转文字的简单api版本,funasr+fastapi,方便部署在服务器上☆13Aug 10, 2024Updated last year
- TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog☆68May 18, 2024Updated 2 years ago
- This repository includes the code to reproduce our paper [Explainable deepfake and spoofing detection: an attack analysis using SHapley A…☆12Jan 24, 2024Updated 2 years ago
- 这是基于FunASR实现的区分说话人语音识别API | This is a speaker-diarization-based speech recognition API implemented using FunASR.☆26Feb 12, 2026Updated 3 months ago
- ☆12Jul 11, 2024Updated last year
- ☆14Aug 9, 2021Updated 4 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- 🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and t…☆27Mar 30, 2026Updated last month
- 基于wenet的短时在线语音识别服务☆11Feb 25, 2023Updated 3 years ago
- <综合> Funasr语音识别,调用Qwen大模型回答,通过GPTSovits输出语音的ai程序,其中调用模型还是在线,后续将添加离线大模型☆13Nov 30, 2024Updated last year
- ASR_LLM_TTS前端项目☆15Dec 3, 2024Updated last year
- SpeechDenoiser: Real-Time Speech Denoising with ONNX Welcome to SpeechDenoiser, a simple and effective solution for real-time speech den…☆115Aug 16, 2024Updated last year
- An implementation of Neural Style Transfer for Audio using Pytorch.☆11Dec 14, 2017Updated 8 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & it's caption derrived f…☆16Apr 22, 2021Updated 5 years ago