kyutai-labs / moshivis
Kyutai with an "eye"
☆217 Updated 5 months ago
Alternatives and similar repositories for moshivis
Users who are interested in moshivis are comparing it to the repositories listed below.
- ☆246 Updated 2 weeks ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆282 Updated 3 months ago
- ☆516 Updated 3 weeks ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆273 Updated 4 months ago
- The official GitHub Page for MiniMax ☆53 Updated 2 months ago
- Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024) ☆294 Updated last week
- ☆294 Updated 2 months ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆308 Updated 5 months ago
- ☆155 Updated 4 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆218 Updated 3 months ago
- ☆102 Updated last year
- The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables" ☆67 Updated last month
- Collection of Open Source Speech Data ☆160 Updated 10 months ago
- ☆78 Updated 4 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 Updated 8 months ago
- GRadient-INformed MoE ☆264 Updated 11 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆116 Updated last month
- ☆457 Updated 3 months ago
- ☆228 Updated 3 months ago
- ☆57 Updated 7 months ago
- ☆632 Updated last month
- A pipeline parallel training script for LLMs. ☆159 Updated 4 months ago
- AudioStory: Generating Long-Form Narrative Audio with Large Language Models ☆268 Updated 2 weeks ago
- ☆447 Updated 4 months ago
- Easy to use, High Performant Knowledge Distillation for LLMs ☆92 Updated 4 months ago
- Video+code lecture on building nanoGPT from scratch ☆69 Updated last year
- VLLM Port of the Chatterbox TTS model ☆293 Updated last week
- Long-form conversational TTS | Community fork ☆204 Updated last week
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆205 Updated 4 months ago
- ☆140 Updated 3 weeks ago