bigcash / awesome-vad
A curated list of awesome voice activity detection
☆70 Updated last year
Alternatives and similar repositories for awesome-vad
Users who are interested in awesome-vad are comparing it to the repositories listed below.
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆92 Updated 2 years ago
- Tunable pipelines ☆41 Updated 3 months ago
- Add n-gram and large language model (LLM) support to Whisper models. ☆40 Updated 7 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆100 Updated 11 months ago
- SpeechDenoiser: Real-Time Speech Denoising with ONNX Welcome to SpeechDenoiser, a simple and effective solution for real-time speech den… ☆107 Updated last year
- ONNX Inference of Pyannote Segmentation ☆97 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆56 Updated 7 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆108 Updated 3 weeks ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆177 Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆102 Updated 9 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆238 Updated last week
- This is the audio sample repository for speech separation model "MossFormer2". ☆161 Updated last year
- ☆156 Updated 3 weeks ago
- ☆74 Updated 2 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 2 years ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr. ☆121 Updated 2 years ago
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec. ☆137 Updated 2 months ago
- ☆58 Updated last year
- The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆159 Updated 2 weeks ago
- Python Wrapper of Silero VAD ☆63 Updated 7 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction ☆168 Updated this week
- ☆94 Updated last month
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023) ☆27 Updated 2 years ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆67 Updated 3 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ☆151 Updated 6 months ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆141 Updated 7 months ago
- Putting flows on top of neural transducers for better TTS ☆64 Updated 3 weeks ago
- Fine-Tune Whisper with Transformers and PEFT ☆58 Updated 2 years ago
- Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup ☆79 Updated 6 months ago
- ☆44 Updated last year