kyutai-labs / moshivis
Kyutai with an "eye"
☆191 · Updated last month
Alternatives and similar repositories for moshivis
Users who are interested in moshivis are comparing it to the repositories listed below.
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆202 · Updated last month
- ☆375 · Updated this week
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆246 · Updated last month
- ☆304 · Updated last week
- ☆214 · Updated last month
- The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables" ☆49 · Updated last week
- Building Blocks for Multi-Modal Gradio Powered by Groq Apps ☆109 · Updated 6 months ago
- ☆156 · Updated last week
- Collection of Open Source Speech Data ☆157 · Updated 6 months ago
- ☆138 · Updated 3 weeks ago
- ☆57 · Updated 3 months ago
- ☆101 · Updated 8 months ago
- LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) ☆194 · Updated this week
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆207 · Updated this week
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆902 · Updated 6 months ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆274 · Updated last month
- ☆74 · Updated 7 months ago
- A lightweight end-to-end text-to-speech model ☆113 · Updated 2 months ago
- G2P ☆239 · Updated 2 weeks ago
- ☆154 · Updated 3 months ago
- [ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation ☆224 · Updated last month
- ☆69 · Updated last week
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆264 · Updated 2 months ago
- ☆171 · Updated 9 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆148 · Updated 3 weeks ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆556 · Updated last month
- Google's NotebookLM, but local ☆238 · Updated 3 weeks ago
- A pipeline parallel training script for LLMs. ☆145 · Updated 2 weeks ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆108 · Updated 2 months ago
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆233 · Updated 8 months ago