Picovoice / falcon
On-device speaker diarization powered by deep learning
☆25 · Updated this week
Related projects
Alternatives and complementary repositories for falcon
- ☆17 · Updated 3 months ago
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… ☆21 · Updated 2 months ago
- On-device noise suppression powered by deep learning ☆63 · Updated last month
- ☆49 · Updated 9 months ago
- ☆32 · Updated 2 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆46 · Updated 5 months ago
- ☆10 · Updated 3 months ago
- ☆33 · Updated last year
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project. ☆59 · Updated 2 years ago
- ONNX Inference of Pyannote Segmentation ☆66 · Updated 2 months ago
- Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way ☆39 · Updated last year
- ☆27 · Updated 7 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆71 · Updated last year
- Fine-Tune Whisper with Transformers and PEFT ☆38 · Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 · Updated last month
- ☆28 · Updated last year
- VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization. ☆46 · Updated 6 months ago
- Alignment examples for Interspeech 2024 ☆14 · Updated 4 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆99 · Updated last year
- Unofficial implementation of the WaveNeXt vocoder ☆32 · Updated 2 months ago
- logWMSE, an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source… ☆28 · Updated 3 months ago
- Differentiable Mean Opinion Score Regularization for Perceptual Speech Enhancement ☆22 · Updated last year
- ☆62 · Updated 6 months ago
- Uses VITS and Opencpop to develop singing voice synthesis; different from VISinger. ☆32 · Updated last year
- Export an ONNX graph that performs ISTFT. Designed for TTS models.