ryota-komatsu / speaker_disentangled_hubert
Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"
☆29Updated last month
Related projects
Alternatives and complementary repositories for speaker_disentangled_hubert
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆66Updated last week
- Codebase and project page for EDMSound☆29Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆42Updated 2 months ago
- GPT-style network for phonemization with durations of text☆62Updated 8 months ago
- Speech enhancement in noisy and reverberant environments using deep neural networks☆15Updated last month
- ☆13Updated 2 months ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆48Updated 3 weeks ago
- Source code for DM-Codec.☆18Updated last month
- ☆23Updated last year
- Implementation of Multi-Source Music Generation with Latent Diffusion.☆18Updated 2 months ago
- Supervoice diffusion enhance☆24Updated 4 months ago
- ☆34Updated 7 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆27Updated 3 months ago
- GPT for FACodec☆13Updated 7 months ago
- Unofficial implementation of the WaveNeXt vocoder☆32Updated 2 months ago
- This is the official repository of ISMIR 2024 paper "Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional R…☆47Updated 2 months ago
- ESLTTS dataset☆16Updated 5 months ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…☆42Updated 4 months ago
- ☆32Updated 2 months ago
- ☆36Updated 4 months ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆22Updated 4 months ago
- My vocoder experiments☆21Updated last month
- Code for the paper "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"☆19Updated 2 weeks ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆56Updated 3 weeks ago
- Just another FastSpeech 2 but cleaner code :)☆25Updated 4 months ago
- ☆42Updated last month
- The official implementation of EmoSphere++☆27Updated 2 weeks ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆37Updated 2 weeks ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆70Updated 7 months ago