fishaudio / fish-speech
SOTA Open Source TTS
☆19,207Updated this week
Alternatives and similar repositories for fish-speech:
Users who are interested in fish-speech are comparing it to the libraries listed below:
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆10,766Updated this week
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key☆7,396Updated 2 weeks ago
- High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.☆5,578Updated last month
- Inference and training library for high-quality TTS models.☆5,025Updated 2 months ago
- A generative speech model for daily dialogue.☆34,495Updated this week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"☆9,735Updated this week
- Multilingual Voice Understanding Model☆4,491Updated last month
- Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team☆11,242Updated this week
- Real time interactive streaming digital human☆4,590Updated 2 weeks ago
- Instant voice cloning by MIT and MyShell. Audio foundation model.☆30,988Updated last month
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio…☆8,508Updated 2 weeks ago
- A simple local web interface that uses ChatTTS to synthesize text into speech, along with su…☆6,690Updated 2 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…☆7,506Updated last week
- Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support e…☆4,913Updated this week
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning)☆40,757Updated this week
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity…☆8,229Updated this week
- Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.☆11,834Updated this week
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine☆7,666Updated 6 months ago
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.☆26,234Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆6,914Updated last week
- Open-source, accurate and easy-to-use video speech recognition & clipping tool, with integrated LLM-based AI clipping.☆4,167Updated 5 months ago
- ⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool.☆14,765Updated last month
- vits2 backbone with multilingual-bert☆8,248Updated last week
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone☆18,531Updated this week
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/☆7,908Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages☆16,314Updated this week
- Bring portraits to life!☆14,060Updated last week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri…☆5,920Updated this week
- Open source real-time translation app for Android that runs locally☆7,352Updated last month
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.☆21,892Updated 3 weeks ago