PoTaTo-Mika / Shore-Data-Engine
A codebase for data crawling and preprocessing for TTS and ASR systems training.
☆22Updated this week
Alternatives and similar repositories for Shore-Data-Engine
Users who are interested in Shore-Data-Engine are comparing it to the libraries listed below.
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆50Updated 4 months ago
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.☆132Updated 4 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆78Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆44Updated last year
- Official Repository of Paper: "Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios" (AAAI 2026)☆86Updated last week
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆30Updated 5 months ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis☆43Updated last year
- [ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.☆116Updated 3 months ago
- Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis☆35Updated last month
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors☆35Updated 11 months ago
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?☆39Updated 2 months ago
- This repo is text to speech with learnable audio encoder without alignment with transcript reference☆53Updated 4 months ago
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…☆62Updated last month
- ☆34Updated 7 months ago
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆50Updated last year
- The official implementation of the DIFFA series for dLLM-based large audio language model☆56Updated last week
- ☆96Updated 3 months ago
- ☆40Updated 6 months ago
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆110Updated 8 months ago
- Official Code for ParrotTTS☆58Updated last year
- Training, inference, and testing of the SAC speech codec model.☆95Updated 3 months ago
- ☆43Updated last year
- Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"☆106Updated last month
- ☆68Updated last month
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆59Updated last year
- Official Repository of UltraVoice☆57Updated 3 months ago
- ☆38Updated last year
- STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation☆70Updated 2 months ago
- An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.☆207Updated 2 weeks ago
- ☆25Updated 7 months ago