IIEleven11 / Automatic-Audio-Dataset-Maker
Uses Deepgram, Whisper, or custom models to create an LJSpeech-format dataset for voice model fine-tuning
☆12 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for Automatic-Audio-Dataset-Maker
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆43 · Updated last month
- ☆83 · Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 · Updated 2 weeks ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆93 · Updated 3 weeks ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆66 · Updated last week
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆132 · Updated 3 months ago
- VoiceBox neural network implementation ☆96 · Updated 3 months ago
- VALL-E 2 reproduction ☆87 · Updated 4 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆61 · Updated last week
- Supervoice diffusion enhance ☆24 · Updated 4 months ago
- ☆20 · Updated 3 weeks ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆86 · Updated last month
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆48 · Updated last month
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆28 · Updated 3 weeks ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆98 · Updated 3 weeks ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆34 · Updated 5 months ago
- The official implementation of EmoSphere++ ☆32 · Updated 2 weeks ago
- Zero-Shot Emotion Style Transfer ☆37 · Updated 7 months ago
- ☆61 · Updated 3 months ago
- Running F5-TTS with ONNX Runtime ☆39 · Updated this week
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆71 · Updated 7 months ago
- Official Code for ParrotTTS ☆43 · Updated last month
- ☆28 · Updated last year
- AudioSR-Upsampling (any -> 48kHz) ☆38 · Updated 9 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 · Updated 5 months ago
- ☆16 · Updated 6 months ago
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS ☆28 · Updated last week
- Text-To-Speech for NotebookLM ☆18 · Updated 3 weeks ago
- ☆59 · Updated last year