fishaudio / fish-speech
SOTA Open Source TTS
☆18,396 Updated this week
Alternatives and similar repositories for fish-speech:
Users who are interested in fish-speech are comparing it to the libraries listed below.
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆9,662 Updated this week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆30,505 Updated last week
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆8,063 Updated this week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆8,947 Updated this week
- A generative speech model for daily dialogue. ☆33,664 Updated this week
- High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean. ☆5,400 Updated 3 weeks ago
- Inference and training library for high-quality TTS models. ☆4,910 Updated last month
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) ☆38,875 Updated 2 weeks ago
- Multilingual Voice Understanding Model ☆4,097 Updated last week
- Bring portraits to life! ☆13,655 Updated 2 weeks ago
- Your image is almost there! ☆7,468 Updated 5 months ago
- A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use. ☆6,532 Updated last month
- A high-quality, one-stop open-source data extraction tool that converts PDF to Markdown and JSON. ☆24,558 Updated this week
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity… ☆7,722 Updated this week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,011 Updated 6 months ago
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine ☆7,594 Updated 5 months ago
- vits2 backbone with multilingual-bert ☆8,183 Updated this week
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆6,814 Updated 3 weeks ago
- Perplexica is an AI-powered search engine. It is an Open source alternative to Perplexity AI ☆18,641 Updated this week
- library & platform to build, distribute, monetize ai apps that have the full context (like rewind, granola, etc.), open source, 100% loca… ☆11,575 Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆15,474 Updated this week
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆36,915 Updated 5 months ago
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ☆13,445 Updated this week
- ☆7,156 Updated this week
- Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your p… ☆22,748 Updated this week
- Industry leading face manipulation platform ☆20,969 Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆6,576 Updated this week
- Official inference repo for FLUX.1 models ☆19,466 Updated last week
- one-click face swap ☆29,019 Updated 4 months ago
- tiny vision language model ☆6,732 Updated this week