MOSS-TTS Family is an open-source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high-fidelity, highly expressive generation in complex real-world scenarios, covering stable long-form speech, multi-speaker dialogue, voice/character design, environmental sound effects, and real-time streaming TTS.
☆984 · Mar 23, 2026 · Updated last week
Alternatives and similar repositories for MOSS-TTS
Users that are interested in MOSS-TTS are comparing it to the libraries listed below.
- Code repo for EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation ☆38 · Mar 6, 2026 · Updated 3 weeks ago
- ☆24 · Jul 20, 2025 · Updated 8 months ago
- MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, i… ☆171 · Mar 6, 2026 · Updated 3 weeks ago
- FreeFuse: Multi-Subject LoRA Fusion via Adaptive Token-Level Routing at Test Time ☆167 · Mar 17, 2026 · Updated last week
- MOVA: Towards Scalable and Synchronized Video–Audio Generation ☆854 · Mar 14, 2026 · Updated 2 weeks ago
- Code2Worlds: Empowering Coding LLMs for 4D World Generation ☆96 · Feb 26, 2026 · Updated last month
- A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation ☆61 · Mar 31, 2025 · Updated 11 months ago
- We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S… ☆285 · Mar 21, 2026 · Updated last week
- This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs ☆89 · Sep 19, 2025 · Updated 6 months ago
- [CVPR'26] VecGlypher: Unified Vector Glyph Generation with Language Models ☆113 · Feb 26, 2026 · Updated last month
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS ☆24 · Sep 9, 2024 · Updated last year
- [ICCV 2025] Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping ☆92 · Nov 30, 2025 · Updated 4 months ago
- Official repo for paper "SK-Adapter: Skeleton-Based Structural Control for Native 3D Generation". ☆51 · Mar 22, 2026 · Updated last week
- Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs ☆33 · Dec 9, 2025 · Updated 3 months ago
- Official code of "RoboOmni: Proactive Robot Manipulation in Omni-modal Context" ☆90 · Nov 17, 2025 · Updated 4 months ago
- Official Codebase for our CVPR 2026 paper UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass ☆137 · Feb 24, 2026 · Updated last month
- Code for 'JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion' ☆226 · Feb 10, 2026 · Updated last month
- MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flex… ☆1,218 · Mar 23, 2026 · Updated last week
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction. ☆137 · Sep 19, 2025 · Updated 6 months ago
- [CVPR 2026] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset ☆587 · Oct 29, 2025 · Updated 5 months ago
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD… ☆76 · Jan 25, 2026 · Updated 2 months ago
- The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss ☆14 · Sep 4, 2023 · Updated 2 years ago
- A curated list of awesome resources about reward construction for AI agents. This repository covers cutting-edge research, and practical … ☆59 · Sep 1, 2025 · Updated 6 months ago
- Official Implementation of ReCo: Region-Constraint In-Context Generation for Instructional Video Editing ☆151 · Mar 5, 2026 · Updated 3 weeks ago
- Official inference code for SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis ☆498 · Updated this week
- ComfyUI custom nodes for Fish Audio S2-Pro TTS: voice clone, multi-speaker, and text-to-speech ☆140 · Mar 22, 2026 · Updated last week
- [NeurIPS 2024] Can Language Models Learn to Skip Steps? ☆22 · Jan 25, 2025 · Updated last year
- [CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning" ☆85 · Feb 13, 2026 · Updated last month
- A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems. ☆108 · May 5, 2025 · Updated 10 months ago
- Towards Systematic Measurement for Long Text Quality ☆38 · Sep 5, 2024 · Updated last year
- CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning! ☆123 · Aug 8, 2025 · Updated 7 months ago
- ☆68 · Jan 12, 2026 · Updated 2 months ago
- TTS-Story is a web-based multi-voice TTS studio for turning tagged scripts into audiobooks, featuring full speaker management, chunk revie… ☆112 · Updated this week
- ☆82 · Mar 7, 2026 · Updated 3 weeks ago
- Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control ☆208 · Feb 26, 2026 · Updated last month
- 🌋LavaSR: Fast Speech restoration and enhancement ☆482 · Mar 10, 2026 · Updated 2 weeks ago
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated 2 years ago
- DreamStyle: A Unified Framework for Video Stylization ☆115 · Jan 7, 2026 · Updated 2 months ago
- ☆50 · Feb 12, 2026 · Updated last month