jjihwan / Voice-Cloning
Simple, Unified Repository for Retrieval-based Voice Conversion
☆17 Updated last year
Alternatives and similar repositories for Voice-Cloning
Users who are interested in Voice-Cloning are comparing it to the libraries listed below.
- ☆10 Updated 2 years ago
- Run Retrieval-based Voice Conversion training and inference with ease. ☆11 Updated 10 months ago
- ☆10 Updated last year
- Multilingual-Speech-Synthesis-Voice-Conversion Using Bark + RVC ☆14 Updated 7 months ago
- Enabling the use of multiple modalities while prompting Stable Diffusion ☆15 Updated 3 years ago
- The Land-Diffuser is a novel application of the Denoising Diffusion Probabilistic Model (DDPM) in the realm of 3D Talking Head generation… ☆13 Updated last year
- Diffusion Model for Voice Conversion ☆17 Updated 3 years ago
- ☆14 Updated 2 years ago
- We archive data because we are interested in the diffs. All data is from https://video-api.cartoonnetwork.com. We run the check every min… ☆10 Updated this week
- Sample and Computation Redistribution for Efficient Face Detection ☆15 Updated last year
- ☆40 Updated 4 months ago
- Apply an end-to-end model structure (ViT + GPT) to describe images in more detail, rather than traditional image captioning that only pro… ☆11 Updated 10 months ago
- Music production for silent film clips. ☆29 Updated 7 months ago
- RVC Onnx Infer- Upgraded and simplified-ish ☆25 Updated last year
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 Updated 2 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Updated last year
- ☆14 Updated 2 years ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech ☆18 Updated 9 months ago
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis. ☆45 Updated last year
- A collection of all our phonemeizers for dataset construction and inference ☆27 Updated 9 months ago
- ☆32 Updated last year
- Misc. tools/scripts that I made to use for tortoise ☆21 Updated last year
- A curated list of resources in audio visual question answering and related area. :-) ☆16 Updated 5 months ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆41 Updated 2 months ago
- Daily tracking of awesome aigc papers, including video generation, video editing, animation. ☆23 Updated 3 months ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. … ☆12 Updated 11 months ago
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues ☆84 Updated 2 months ago
- Talking Face Generation system ☆19 Updated 2 years ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆23 Updated 2 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆87 Updated last year