SWivid / Habibi-TTS
Official code for "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis"
☆56 Feb 3, 2026 Updated last week
Alternatives and similar repositories for Habibi-TTS
Users that are interested in Habibi-TTS are comparing it to the libraries listed below
- Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training ☆62 Feb 7, 2026 Updated last week
- MOSS-Speech is a true speech-to-speech large language model without text guidance. ☆122 Dec 4, 2025 Updated 2 months ago
- ☆11 Mar 22, 2023 Updated 2 years ago
- This repository provides an implementation of the DPCCN model for single-channel speech separation. More details will be updated soon. ☆13 Dec 8, 2021 Updated 4 years ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models ☆50 Sep 2, 2025 Updated 5 months ago
- WIP TensorFlow implementation of https://github.com/mozilla/TTS ☆15 Apr 11, 2020 Updated 5 years ago
- Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION", which was accepted at EUSIPCO 2022 ☆15 Jun 18, 2022 Updated 3 years ago
- Dataset ☆28 Jul 31, 2025 Updated 6 months ago
- This repository provides information on how to use the SINS database along with some example code. The SINS Dataset is composed of conti… ☆23 Dec 23, 2022 Updated 3 years ago
- An implementation of the ICASSP paper 'A fully convolutional neural network for complex spectrogram processing in speech enhancement' ☆21 Feb 19, 2020 Updated 5 years ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆56 Jun 25, 2024 Updated last year
- A TTS Trained on Universal Audio. ☆41 Jun 6, 2025 Updated 8 months ago
- ☆82 Dec 31, 2025 Updated last month
- Causal streaming adaptation of OpenAI Whisper for real-time transcription on small audio chunks. ☆62 Sep 18, 2025 Updated 4 months ago
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations ☆62 Jan 16, 2025 Updated last year
- ☆99 Jan 19, 2026 Updated 3 weeks ago
- SpeechJudge: Towards Human-Level Judgment for Speech Naturalness (https://arxiv.org/abs/2511.07931) ☆56 Dec 23, 2025 Updated last month
- The official implementation of the DIFFA series for dLLM-based large audio language model ☆59 Feb 2, 2026 Updated last week
- Export an ONNX graph that performs ISTFT. Designed for TTS models. ☆27 Apr 23, 2024 Updated last year
- A benchmark for evaluating audio encoders on various audio tasks. ☆42 Dec 11, 2025 Updated 2 months ago
- Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab. ☆71 Jan 14, 2026 Updated last month
- FireRedASR2S is a SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandari… ☆199 Updated this week
- [NeurIPS' 25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges. ☆189 Dec 9, 2025 Updated 2 months ago
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆123 Sep 2, 2025 Updated 5 months ago
- [INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment" ☆64 Jun 16, 2025 Updated 7 months ago
- The KlicStudio MCP server is a connector based on the Model Context Protocol (MCP), designed to facilitate interactions with KlicStudio s… ☆19 Jul 30, 2025 Updated 6 months ago
- ☆94 Oct 16, 2025 Updated 3 months ago
- Text Normalization utilities for normalizing text for TTS ☆20 Updated this week
- Code for reproducing the paper "Neural Networks Fail to Learn Periodic Functions and How to Fix It" as part of the ML Reproducibility Cha… ☆11 Apr 16, 2021 Updated 4 years ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆43 Oct 28, 2024 Updated last year
- Training, inference, and testing of the SAC speech codec model. ☆96 Nov 1, 2025 Updated 3 months ago
- [NeurIPS 2025] Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix ☆195 Dec 13, 2025 Updated 2 months ago
- Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control ☆88 Jan 19, 2026 Updated 3 weeks ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators ☆54 May 15, 2025 Updated 8 months ago
- An image fusion technique presented in "Poisson image editing", P. Pérez, M. Gangnet, and A. Blake, SIGGRAPH 2003. ☆14 Jan 13, 2020 Updated 6 years ago
- Mini Callcenter Simulator simulates a call center and takes into account many parameters not covered by the Erlang C formula. ☆12 Jan 23, 2026 Updated 3 weeks ago
- VS Code tools for NextBASIC ☆12 Apr 22, 2025 Updated 9 months ago
- ☆10 Oct 20, 2022 Updated 3 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 Apr 10, 2025 Updated 10 months ago