a-r-r-o-w / stablefused
StableFused is a toy library for experimenting with Diffusion Models, inspired by various sources.
☆14Updated 2 years ago
Alternatives and similar repositories for stablefused
Users that are interested in stablefused are comparing it to the libraries listed below
- ☆36Updated 2 weeks ago
- liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project☆34Updated 2 years ago
- ☆20Updated 3 years ago
- Singing voice conversion based on FreeVC☆21Updated 3 years ago
- The official implementation of "A Language Modeling Approach to Diacritic-Free Hebrew TTS"☆103Updated 6 months ago
- Music production for silent film clips.☆30Updated 7 months ago
- Solving Inverse Problems with Diffusion Optimal Control [NeurIPS 2024]☆16Updated 11 months ago
- SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs☆16Updated 2 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io☆16Updated last year
- An AR+AR TTS attempt.☆18Updated 11 months ago
- ☆61Updated 2 years ago
- Real-time streaming voice anonymization & voice conversion☆24Updated 3 weeks ago
- AudioLDM text to audio colab☆19Updated 2 years ago
- ☆30Updated last year
- Controlled audio inpainting using SD-fine tuned model Riffusion in a ControlNet Architecture☆32Updated 2 years ago
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s…☆28Updated 2 years ago
- Anim-400K: A dataset designed from the ground up for automated dubbing of video☆110Updated last year
- Misc. tools/scripts that I made to use for tortoise☆21Updated last year
- An official implementation of Style-Talker for Spoken Dialogue Generation☆23Updated 11 months ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech☆19Updated 10 months ago
- ☆13Updated 2 years ago
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS☆40Updated 2 years ago
- The source code for the paper CrossSinger (asru2023)☆18Updated 2 years ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors☆35Updated 10 months ago
- ☆68Updated 2 years ago
- 4G GPU & 10 Minutes for train☆12Updated 2 years ago
- ☆46Updated 8 months ago
- Open Source Text-to-Speech GUI Tool running on TalkNet☆11Updated 2 years ago
- [ICASSP 2024] DiffDub: Person-generic visual dubbing using inpainting renderer with diffusion auto-encoder☆67Updated last year
- ☆40Updated last year