IIEleven11 / Automatic-Audio-Dataset-Maker
Uses Deepgram/Whisper/custom models to create an LJSpeech dataset for voice model fine-tuning
☆16 Updated 2 weeks ago
Alternatives and similar repositories for Automatic-Audio-Dataset-Maker:
Users who are interested in Automatic-Audio-Dataset-Maker are comparing it to the libraries listed below
- Provides Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆53 Updated last month
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆65 Updated 3 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆96 Updated this week
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io ☆67 Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆74 Updated 3 weeks ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆169 Updated 3 months ago
- The official implementation of PeriodWave and PeriodWave-Turbo ☆146 Updated last month
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆65 Updated 2 months ago
- ☆23 Updated 2 months ago
- Official repository of Wavehax vocoder ☆45 Updated last month
- All generative models in one for a better TTS model ☆66 Updated 4 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆90 Updated 3 months ago
- The official implementation of EmoSphere++ ☆67 Updated 2 months ago
- Implementation of RIFT-SVC, a singing voice conversion model based on Rectified Flow Transformer. ☆26 Updated this week
- An unofficial PyTorch implementation of VALL-E ☆87 Updated this week
- Official Code for ParrotTTS ☆46 Updated 3 months ago
- VALL-E 2 reproduction ☆109 Updated 6 months ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆78 Updated 9 months ago
- ☆18 Updated 8 months ago
- VoiceBox neural network implementation ☆100 Updated 5 months ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆34 Updated 7 months ago
- ☆28 Updated last year
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆114 Updated this week
- VietTTS: An Open-Source Vietnamese Text to Speech ☆23 Updated last month
- ☆65 Updated 4 months ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆49 Updated 2 months ago
- Zero-Shot Emotion Style Transfer ☆39 Updated 9 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated 10 months ago