jjihwan / Voice-Cloning
Simple, Unified Repository for Retrieval-based Voice Conversion
☆17 · Updated last year
Alternatives and similar repositories for Voice-Cloning
Users who are interested in Voice-Cloning are comparing it to the repositories listed below.
- Run Retrieval-based Voice Conversion training and inference with ease. ☆11 · Updated 10 months ago
- Enabling the use of multiple modalities while prompting Stable Diffusion ☆15 · Updated 3 years ago
- ☆10 · Updated last year
- ☆10 · Updated 2 years ago
- Multilingual-Speech-Synthesis-Voice-Conversion Using Bark + RVC ☆14 · Updated 8 months ago
- A Versatile Face Encoder for Zero-Shot Diffusion Model Personalization ☆24 · Updated 5 months ago
- The Land-Diffuser is a novel application of the Denoising Diffusion Probabilistic Model (DDPM) in the realm of 3D Talking Head generation… ☆13 · Updated last year
- Make any person bald!! Component of the paper: Learning to regulate 3D head shape by removing occluding hair from in-the-wild images. ☆12 · Updated 3 years ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech ☆22 · Updated last year
- Find the aesthetic score of your images using a neural network predictor ☆15 · Updated 9 months ago
- Music production for silent film clips. ☆30 · Updated 7 months ago
- ☆15 · Updated last year
- An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io ☆16 · Updated last year
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆23 · Updated last week
- Apply an end-to-end model structure (ViT + GPT) to describe images in more detail, rather than traditional image captioning that only pro… ☆11 · Updated 11 months ago
- Misc. tools/scripts that I made to use for tortoise ☆21 · Updated last year
- Diffusion Model for Voice Conversion ☆17 · Updated 3 years ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models". ☆25 · Updated last month
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆42 · Updated 3 months ago
- Daily tracking of awesome aigc papers, including video generation, video editing, animation. ☆24 · Updated 4 months ago
- ☆41 · Updated 5 months ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech ☆19 · Updated 10 months ago
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback… ☆10 · Updated 3 months ago
- Sample and Computation Redistribution for Efficient Face Detection ☆15 · Updated last year
- ☆28 · Updated 11 months ago
- ☆14 · Updated 2 years ago
- Fine-tune of Florence-2 for shot categorization. ☆26 · Updated 9 months ago
- Official Implementation for "StyleDomain: Efficient and Lightweight Parameterizations of StyleGAN for One-shot and Few-shot Domain Adapta… ☆30 · Updated last year
- Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023) ☆17 · Updated last year
- RVC Onnx Infer- Upgraded and simplified-ish ☆25 · Updated last year