rkmt / wesper-demo
☆29 Updated last year
Alternatives and similar repositories for wesper-demo
Users who are interested in wesper-demo are comparing it to the libraries listed below.
- ☆92 Updated 3 weeks ago
- Voice Activity Projection Models: Self-supervised learning of Turn-taking Events ☆86 Updated last year
- ☆35 Updated last year
- Real-time binaural target sound extraction model. ☆94 Updated last year
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆14 Updated 2 weeks ago
- SelfRemaster: SSL Speech Restoration ☆93 Updated last year
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆98 Updated 10 months ago
- PyTorch implementation of WaveFit [2022, Google], one of the SOTA lightweight/fast speech vocoders. ☆62 Updated 2 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆27 Updated 8 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆91 Updated 2 years ago
- A sequence-to-sequence voice conversion toolkit. ☆106 Updated last year
- ☆65 Updated last year
- ☆40 Updated 3 years ago
- Clustering-based methods for overlapping diarization ☆81 Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆50 Updated 8 months ago
- Audio-visual diarization pipeline used for creating VoxConverse dataset ☆21 Updated 5 months ago
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System". ☆30 Updated 4 months ago
- ☆22 Updated last year
- [SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model ☆131 Updated last month
- Distillation of Self-Supervised Representation-Based Speech Quality Assessment ☆38 Updated 6 months ago
- An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion) ☆130 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆103 Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 Updated 2 months ago
- Unofficial implementation of the WaveNeXt vocoder ☆52 Updated last year
- This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic… ☆53 Updated 3 months ago
- A repo containing download guidance and corresponding scripts of the VoxBlink dataset. ☆28 Updated last year
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆43 Updated 2 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆130 Updated 2 years ago
- This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one … ☆39 Updated last year
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆146 Updated last year