jasonppy / VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
☆8,011 Updated 6 months ago
Alternatives and similar repositories for VoiceCraft:
Users who are interested in VoiceCraft are comparing it to the libraries listed below.
- Inference and training library for high-quality TTS models. ☆4,910 Updated last month
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,265 Updated 5 months ago
- Foundational model for human-like, expressive TTS ☆3,979 Updated 5 months ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,080 Updated last month
- High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean. ☆5,400 Updated 3 weeks ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆8,947 Updated this week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆30,505 Updated last week
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆8,063 Updated this week
- ☆7,156 Updated this week
- tiny vision language model ☆6,732 Updated this week
- Open Source framework for voice and multimodal conversational AI ☆4,299 Updated this week
- SOTA Open Source TTS ☆18,396 Updated this week
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆13,382 Updated this week
- Your image is almost there! ☆7,468 Updated 5 months ago
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,492 Updated this week
- AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation ☆4,775 Updated 6 months ago
- Generative models for conditional audio generation ☆2,833 Updated last week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆9,662 Updated this week
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,570 Updated 5 months ago
- ☆7,938 Updated 7 months ago
- AI Browser ☆4,226 Updated 2 weeks ago
- Private & local AI personal knowledge management app for high entropy people. ☆7,515 Updated last month
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆3,679 Updated last month
- Faster Whisper transcription with CTranslate2 ☆13,490 Updated 2 weeks ago
- Foundational Models for State-of-the-Art Speech and Text Translation ☆11,156 Updated 2 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,699 Updated last week
- Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Llama-3, Langchain, OpenAI, Upstash, Brave & Serper ☆4,775 Updated 3 months ago
- TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5,… ☆1,947 Updated last month
- The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on… ☆28,480 Updated this week
- A fast multimodal LLM for real-time voice ☆2,760 Updated this week