Whisper combined with Silero VAD, for improved long-form transcriptions
☆54Dec 11, 2022Updated 3 years ago
Alternatives and similar repositories for WhisperWithVAD
Users that are interested in WhisperWithVAD are comparing it to the libraries listed below.
- Robust Speech Recognition via Large-Scale Weak Supervision☆19Dec 1, 2022Updated 3 years ago
- Zero-shot NER fine-tuning☆14Mar 17, 2025Updated last year
- Language independent SSL-based Speaker Anonymization system☆19May 28, 2024Updated last year
- [APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models☆14Oct 19, 2022Updated 3 years ago
- Joint speech-language model - respond directly to audio!☆30May 13, 2024Updated last year
- A streaming whisper server for on-prem transcription☆23Aug 15, 2024Updated last year
- Simple LPC vocoder in Python☆13Jan 7, 2022Updated 4 years ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts☆349Nov 12, 2024Updated last year
- Open Source Text-to-Speech GUI Tool running on TalkNet☆11Dec 24, 2022Updated 3 years ago
- ☆14Feb 9, 2023Updated 3 years ago
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)☆22Jun 5, 2025Updated 10 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision☆13Oct 28, 2023Updated 2 years ago
- Collaborative transcription service that keeps getting better☆23Nov 8, 2023Updated 2 years ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆11Sep 30, 2024Updated last year
- ☆38Dec 26, 2022Updated 3 years ago
- Datasets for turn-taking research☆19Dec 21, 2023Updated 2 years ago
- ☆10Oct 25, 2019Updated 6 years ago
- AviSynth CUDA Filters☆35Mar 24, 2019Updated 7 years ago
- ☆11May 23, 2023Updated 2 years ago
- Self-Contrastive Learning: Single-viewed Supervised Contrastive Framework using Sub-network (AAAI 2023)☆21Oct 28, 2023Updated 2 years ago
- Convert .lab files to .TextGrid files, which can be used in Praat☆14Nov 2, 2018Updated 7 years ago
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- Optimizing speaker verification and spoofing countermeasure systems together with REINFORCE☆13Mar 31, 2021Updated 5 years ago
- ☆49Apr 28, 2023Updated 2 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)☆125Jun 16, 2022Updated 3 years ago
- Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing" published in Interspeech 2023.☆30Nov 7, 2023Updated 2 years ago
- Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit☆13Nov 18, 2022Updated 3 years ago
- 🔊 Text-prompted Generative Audio Model - With the ability to clone voices☆21May 17, 2023Updated 2 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆17May 16, 2025Updated 11 months ago
- Hpyformer base FunASR☆30Nov 5, 2024Updated last year
- Simple API for CosyVoice speech synthesis☆14Nov 1, 2024Updated last year
- A simple API version of funasr speech-to-text, built with funasr + fastapi for easy deployment on a server☆13Aug 10, 2024Updated last year
- TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog☆68May 18, 2024Updated last year
- ☆12Jul 11, 2024Updated last year
- ☆14Aug 9, 2021Updated 4 years ago
- ASR_LLM_TTS front-end project☆15Dec 3, 2024Updated last year
- ISFMXFW - UI Enhancer For Inno Setup☆10Apr 4, 2023Updated 3 years ago
- Inspecting the Moto Audio application running on Motorola Android devices☆10Aug 28, 2021Updated 4 years ago
- An implementation of Neural Style Transfer for Audio using PyTorch.☆11Dec 14, 2017Updated 8 years ago