daanzu / py-silero-vad-lite
Lightweight wrapper for Silero VAD using an internal ONNX Runtime, with no Python package dependencies
☆14 Updated 6 months ago
Alternatives and similar repositories for py-silero-vad-lite
Users that are interested in py-silero-vad-lite are comparing it to the libraries listed below
- Voice activity engine benchmark framework ☆15 Updated 3 weeks ago
- Finally, some decent sample sentences ☆23 Updated last year
- ☆13 Updated 9 months ago
- Onnx compatible styletts2 code ☆11 Updated last month
- ☆12 Updated 2 years ago
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆18 Updated 2 weeks ago
- ☆10 Updated 6 months ago
- ☆13 Updated last month
- ☆11 Updated 3 years ago
- ☆12 Updated 4 months ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation ☆13 Updated 8 months ago
- A collection of all our phonemizers for dataset construction and inference ☆23 Updated 3 months ago
- Python bindings of speexdsp noise suppression library ☆38 Updated 2 years ago
- Voice activity detection and speaker gender segmentation audiovisual corpus ☆14 Updated 4 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated 2 years ago
- ☆11 Updated last year
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW ☆13 Updated 5 months ago
- A simple, but performant framework for mapping speech directly to categories and intents. ☆20 Updated 9 months ago
- Simple inference for Vits2 TTS Using ONNXRUNTIME and espeak-ng on C++ ☆17 Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated 2 weeks ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆51 Updated last week
- ☆29 Updated last year
- A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation. ☆29 Updated 11 months ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition" ☆10 Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated last year
- Audio-visual diarization pipeline used for creating VoxConverse dataset ☆21 Updated 3 months ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model ☆32 Updated last year
- Export an ONNX graph that performs ISTFT. Designed for TTS models. ☆24 Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 Updated last year
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement. ☆13 Updated 7 months ago