SocAIty / Retrieval-based-Voice-Conversion-FastAPI
Adds a web API to RVC for running inference via JSON requests
☆27 Updated last year
Alternatives and similar repositories for Retrieval-based-Voice-Conversion-FastAPI
Users that are interested in Retrieval-based-Voice-Conversion-FastAPI are comparing it to the libraries listed below
- API server for Instant voice cloning by MyShell. ☆99 Updated 11 months ago
- A random walk voice style cloning application for Kokoro text to speech ☆124 Updated 2 months ago
- ☆67 Updated 5 months ago
- SoTA open-source TTS ☆69 Updated last week
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆159 Updated last year
- ☆99 Updated last year
- Streaming and Fine-tuning for Chatterbox TTS ☆164 Updated 2 months ago
- Since the owner of the repo took it down and it used an MIT license, I guess it's okay to upload it here for people to use. ☆51 Updated 5 months ago
- fast state-of-the-art speech models and a runtime that runs anywhere 💥 ☆55 Updated 2 months ago
- ☆70 Updated 3 weeks ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆23 Updated 5 months ago
- Examples of using the llasa-tts models locally ☆180 Updated 4 months ago
- XTTSv2 Extension for oobabooga text-generation-webui ☆155 Updated last year
- ☆45 Updated 7 months ago
- ☆51 Updated 9 months ago
- Simulates talk with an AI that can express emotions ☆78 Updated 2 months ago
- A very simple implementation of edge_tts w/ RVC for oobabooga text-generation-webui. ☆41 Updated last year
- Realtime tts reading of large textfiles by your favourite voice. +Translation via LLM (Python script) ☆52 Updated 10 months ago
- Quantized text-audio foundation model from Boson AI ☆25 Updated 2 weeks ago
- ☆83 Updated last year
- This repo provides a simple Gradio UI to run Qwen2 VL 72B AWQ in venv and have both image and video inferencing work. ☆30 Updated 10 months ago
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on python (Gradio interface). Translated on … ☆100 Updated 3 weeks ago
- A UI for the Piper TTS ☆98 Updated last year
- A functioning Sesame CSM project with a desktop GUI - Real-time factor: 0.6x with 4070 Ti Super - Requires only 8GB VRAM ☆55 Updated 3 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆31 Updated 4 months ago
- Service for testing out the new Qwen2.5 omni model ☆57 Updated 4 months ago
- This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized co… ☆57 Updated 10 months ago
- TTS pipeline that uses RVC to enhance audio quality and cloning ☆145 Updated last year
- Made slight modifications to the Tortoise API, provided 3 additional scripts to make using Tortoise easier. Less focus on cloning makes s… ☆52 Updated last year
- AI 3D avatar voice interface in browser. VAD -> STT -> LLM -> TTS -> VRM (Prototype/Proof-of-Concept) ☆71 Updated 2 years ago