hlt-mt / mosel
Collection of Open Source Speech Data
☆164Oct 3, 2025Updated 4 months ago
Alternatives and similar repositories for mosel
Users who are interested in mosel are comparing it to the libraries listed below
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆67Nov 1, 2024Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…☆24Oct 8, 2025Updated 4 months ago
- ☆30Oct 29, 2024Updated last year
- speaker-disentangled speech linguistic content quantizer☆24Mar 19, 2025Updated 10 months ago
- ☆19Sep 20, 2024Updated last year
- ☆99Jan 19, 2026Updated 3 weeks ago
- The open source code for SimpleSpeech series☆145Oct 8, 2024Updated last year
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'☆153Mar 24, 2025Updated 10 months ago
- A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.☆112Jun 4, 2025Updated 8 months ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis☆145Jan 1, 2025Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 7 months ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- [NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching☆121Mar 27, 2025Updated 10 months ago
- ☆167Sep 19, 2024Updated last year
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- source code of EfficientTTS 2☆20Feb 18, 2024Updated last year
- Text-To-Speech for NotebookLM☆37Jul 20, 2025Updated 6 months ago
- UTokyo-SaruLab MOS Prediction System☆290Feb 6, 2026Updated last week
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models☆24Jun 16, 2025Updated 7 months ago
- ☆19May 2, 2024Updated last year
- Reference-aware automatic speech evaluation toolkit☆178Dec 5, 2024Updated last year
- F5-TTS inference acceleration, about 4x faster!☆122Jan 6, 2025Updated last year
- An AR+AR TTS attempt.☆18Jan 13, 2025Updated last year
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆50Sep 20, 2025Updated 4 months ago
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS).☆51Jun 11, 2024Updated last year
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆214Sep 10, 2024Updated last year
- ☆59Oct 22, 2025Updated 3 months ago
- VITS with phoneme-level prosody modeling based on MaskGIT☆85Aug 31, 2024Updated last year
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature☆93Mar 12, 2025Updated 11 months ago
- ☆70Sep 3, 2024Updated last year
- Update ASR paper everyday☆452Updated this week
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability☆108Jan 17, 2025Updated last year
- Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice☆505Dec 22, 2025Updated last month
- ☆23Oct 17, 2024Updated last year
- ☆36Sep 6, 2025Updated 5 months ago
- ☆15Mar 31, 2025Updated 10 months ago
- ☆14Aug 19, 2024Updated last year
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…☆197Jan 25, 2026Updated 2 weeks ago
- [Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec☆61Dec 24, 2025Updated last month