jjihwan / Voice-Cloning
Simple, Unified Repository for Retrieval-based Voice Conversion
☆17 Updated last year
Alternatives and similar repositories for Voice-Cloning
Users who are interested in Voice-Cloning are comparing it to the libraries listed below
- ☆10 Updated 2 years ago
- Run Retrieval-based Voice Conversion training and inference with ease. ☆11 Updated 9 months ago
- Multilingual-Speech-Synthesis-Voice-Conversion Using Bark + RVC ☆14 Updated 6 months ago
- A curated list of resources in audio visual question answering and related area. :-) ☆15 Updated 4 months ago
- Enabling the use of multiple modalities while prompting Stable Diffusion ☆14 Updated 3 years ago
- The Land-Diffuser is a novel application of the Denoising Diffusion Probabilistic Model (DDPM) in the realm of 3D Talking Head generation… ☆13 Updated last year
- ☆10 Updated last year
- ☆15 Updated last year
- Diffusion Model for Voice Conversion ☆17 Updated 3 years ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆40 Updated last month
- RVC Onnx Infer- Upgraded and simplified-ish ☆23 Updated last year
- ☆12 Updated last year
- Apply an end-to-end model structure (ViT + GPT) to describe images in more detail, rather than traditional image captioning that only pro… ☆11 Updated 9 months ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. … ☆12 Updated 11 months ago
- ☆20 Updated last year
- [TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing ☆26 Updated last year
- A pipeline to generate user-preferred photo-realistic avatars using stable-diffusion and bayesian-optimization. ☆18 Updated 5 months ago
- This project predicts wind turbine failure using numerous sensor data by applying classification based ML models that improves prediction… ☆10 Updated 2 years ago
- A Versatile Face Encoder for Zero-Shot Diffusion Model Personalization ☆24 Updated 3 months ago
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues ☆81 Updated last month
- ☆32 Updated last year
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆22 Updated 2 months ago
- Talking head animation ☆28 Updated last year
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 Updated 2 years ago
- ☆24 Updated last year
- Music production for silent film clips. ☆28 Updated 6 months ago
- ☆40 Updated 3 months ago
- We archive data because we are interested in the diffs. All data is from https://video-api.cartoonnetwork.com. We run the check every min… ☆10 Updated this week
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆15 Updated last year
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆86 Updated last year