bigcash / awesome-vad
A curated list of awesome voice activity detection
☆71 Updated last year
Alternatives and similar repositories for awesome-vad
Users who are interested in awesome-vad are comparing it to the libraries listed below.
- Speaker change detection using SincNet and an LSTM/Transformer ☆56 Updated 7 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆92 Updated 2 years ago
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec. ☆143 Updated 3 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆101 Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆113 Updated last month
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆179 Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆104 Updated 9 months ago
- ONNX Inference of Pyannote Segmentation ☆97 Updated last year
- Tunable pipelines ☆41 Updated 4 months ago
- SpeechDenoiser: Real-Time Speech Denoising with ONNX Welcome to SpeechDenoiser, a simple and effective solution for real-time speech den… ☆110 Updated last year
- This is the audio sample repository for speech separation model "MossFormer2". ☆166 Updated last year
- An unofficial PyTorch implementation of VALL-E ☆88 Updated 5 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆241 Updated this week
- Add n-gram and large language model (LLM) support to Whisper models. ☆40 Updated 8 months ago
- ☆76 Updated 3 months ago
- a lightweight voice conversion ☆86 Updated last year
- Putting flows on top of neural transducers for better TTS ☆64 Updated last month
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆104 Updated last year
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆67 Updated 3 years ago
- Python bindings of speexdsp noise suppression library ☆46 Updated 3 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆111 Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆266 Updated last year
- ☆46 Updated last year
- VoiceBox neural network implementation ☆110 Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆150 Updated 2 years ago
- ☆29 Updated 11 months ago
- ☆58 Updated last year
- ☆94 Updated 2 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction ☆173 Updated this week
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 2 years ago