avishaiElmakies / unsupervised_speech_segmentation_using_slm
☆10 Updated this week
Alternatives and similar repositories for unsupervised_speech_segmentation_using_slm:
Users who are interested in unsupervised_speech_segmentation_using_slm are comparing it to the repositories listed below:
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor ☆19 Updated last year
- ☆10 Updated last month
- A pitch detection model trained to be robust against noise and reverberation environments. ☆23 Updated 4 months ago
- Codec for paper: LLaSA: Scaling Train Time and Test Time Compute for LLaMA based Speech Synthesis. ☆39 Updated this week
- Source code for DM-Codec. ☆33 Updated 2 months ago
- ☆25 Updated 7 months ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 Updated 11 months ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆30 Updated last year
- Lightweight Speech Representation Learning for One-Shot Voice Conversion ☆19 Updated 3 weeks ago
- ☆44 Updated last year
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P… ☆35 Updated last year
- Event Relation in Text-to-Audio (TTA) Generation ☆16 Updated last week
- ☆37 Updated 6 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆41 Updated 3 months ago
- ☆53 Updated 2 months ago
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS ☆14 Updated 4 months ago
- Please visit https://thuhcsi.github.io/SnakeGAN/ ☆36 Updated last year
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆49 Updated 2 months ago
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation … ☆45 Updated 2 weeks ago
- Unofficial PyTorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC… ☆15 Updated last year
- ☆53 Updated 11 months ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset ☆12 Updated last month
- Unofficial PyTorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (… ☆59 Updated 9 months ago
- Official repo of ICASSP 2024 paper - Generative De-Quantization for Neural Speech Codec via Latent Diffusion. ☆48 Updated this week
- A spoken version of the textual story cloze benchmark ☆14 Updated last year
- Code for the paper "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching" ☆27 Updated last month
- ☆30 Updated last year
- Streaming Vocos ☆17 Updated this week
- Evaluation tool used in the BigVSAN paper ☆10 Updated 9 months ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆44 Updated 6 months ago