DataoceanAI / Dolphin
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
☆25 Updated this week
Alternatives and similar repositories for Dolphin:
Users who are interested in Dolphin are comparing it to the repositories listed below.
- ☆26 Updated last month
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆85 Updated last year
- Inference code for Audiodec-Valle-Wenetspeech4TTS ☆48 Updated 8 months ago
- ☆64 Updated last year
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆48 Updated 9 months ago
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer ☆75 Updated last year
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud… ☆93 Updated 3 months ago
- The open source code for SimpleSpeech series ☆137 Updated 5 months ago
- All generative models in one for a better TTS model ☆66 Updated 6 months ago
- ☆68 Updated 6 months ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆128 Updated 2 months ago
- A low-bitrate single-codebook 16 kHz speech codec based on focal modulation ☆81 Updated last month
- ☆28 Updated last year
- Official release of StyleTalk dataset. ☆62 Updated 9 months ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆51 Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated last year
- E2E TTS using Conditional Flow Matching (Experimental*) ☆69 Updated last year
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆110 Updated last week
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆150 Updated last year
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a… ☆64 Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆27 Updated 11 months ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ☆79 Updated 3 months ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆34 Updated 9 months ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos. ☆37 Updated 5 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆64 Updated 4 months ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts). ☆37 Updated last week
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆139 Updated 11 months ago
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey". ☆127 Updated this week
- AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data ☆30 Updated last year
- ☆36 Updated 6 months ago