cmeraki / audiotoken
Audio tokenization, in the fastest way possible!
☆45 Updated 2 months ago
Related projects
Alternatives and complementary repositories for audiotoken
- Collection of scripts from mHuBERT-147. ☆22 Updated 4 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 Updated last month
- The YouTube Text-To-Speech dataset comprises waveform audio extracted from YouTube videos alongside their English transcriptions. ☆50 Updated 3 years ago
- Official Code for ParrotTTS ☆41 Updated 3 weeks ago
- GPT for FACodec ☆13 Updated 7 months ago
- GPT-style network for phonemization with durations of text ☆62 Updated 7 months ago
- Supervoice diffusion enhance ☆25 Updated 3 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆12 Updated 5 months ago
- Implementation of Google's USM speech model in PyTorch ☆25 Updated this week
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆65 Updated 7 months ago
- Temporary anonymous version ☆22 Updated 7 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆28 Updated 6 months ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ☆46 Updated last month
- Finetuning VITS Efficiently ☆32 Updated last year
- VoiceBank-2023 is a speech corpus designed specifically for constructing personalized Mandarin text-to-speech (TTS) systems. ☆36 Updated last year
- Official release of StyleTalk dataset. ☆57 Updated 4 months ago
- A TTS model that makes a speaker speak new languages ☆75 Updated 4 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated 8 months ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆29 Updated 3 months ago
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆90 Updated 2 weeks ago
- Running F5-TTS with ONNX Runtime ☆25 Updated this week
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆35 Updated 3 weeks ago