myshell-ai / MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
☆4,398 · Updated last month
Related projects:
- Inference and training library for high-quality TTS models. ☆4,193 · Updated last month
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆4,450 · Updated last week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆7,459 · Updated 2 months ago
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine ☆7,201 · Updated last month
- Foundational model for human-like, expressive TTS ☆3,721 · Updated last month
- Brand new TTS solution ☆11,190 · Updated this week
- AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation ☆4,498 · Updated 2 months ago
- Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support e… ☆3,133 · Updated this week
- An Open Source text-to-speech system built by inverting Whisper. ☆3,772 · Updated 3 months ago
- Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated. ☆3,267 · Updated 3 weeks ago
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆5,259 · Updated 2 months ago
- [SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild ☆6,376 · Updated last month
- V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images. ☆2,182 · Updated 2 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,495 · Updated 2 months ago
- Real time interactive streaming digital human ☆3,462 · Updated last week
- Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing> ☆4,216 · Updated 2 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆4,714 · Updated last month
- Enjoy the magic of Diffusion models! ☆6,349 · Updated this week
- Create Magic Story! ☆5,787 · Updated last month
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆6,363 · Updated this week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆4,768 · Updated last week
- Multilingual Voice Understanding Model ☆2,625 · Updated 2 weeks ago
- Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on ☆5,392 · Updated 4 months ago
- FreeAskInternet is a completely free, PRIVATE and LOCALLY running search aggregator & answer generator using MULTI LLMs, without GPU neede… ☆8,451 · Updated 5 months ago
- tiny vision language model ☆4,893 · Updated 3 weeks ago
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity… ☆5,939 · Updated this week
- Open Source framework for voice and multimodal conversational AI ☆3,044 · Updated this week
- Your image is almost there! ☆7,207 · Updated last month
- MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising ☆2,318 · Updated 2 months ago
- OCR, layout analysis, reading order, line detection in 90+ languages ☆9,849 · Updated this week