skirdey / voicerestore
VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration
☆118 Updated last week
Alternatives and similar repositories for voicerestore:
Users who are interested in voicerestore compare it to the repositories listed below.
- ☆254 Updated 11 months ago
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆163 Updated this week
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆56 Updated this week
- G2P ☆107 Updated this week
- Google's SoundStorm: Efficient Parallel Audio Generation ☆131 Updated last year
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆66 Updated 4 months ago
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆158 Updated this week
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆171 Updated 4 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆92 Updated 4 months ago
- ☆344 Updated 5 months ago
- Collection of Open Source Speech Data ☆151 Updated 3 months ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆320 Updated this week
- An unofficial PyTorch implementation of VALL-E ☆87 Updated this week
- ☆62 Updated 6 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆150 Updated 7 months ago
- VoiceBox neural network implementation ☆101 Updated 6 months ago
- Running the F5-TTS by ONNX Runtime ☆101 Updated this week
- VALL-E 2 reproduction ☆111 Updated 7 months ago
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ☆131 Updated last month
- Uses deepgram/whisper/custom models to create an LJSpeech dataset for voice model fine tuning ☆25 Updated this week
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 8 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆67 Updated last year
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆231 Updated 5 months ago
- F5-TTS inference acceleration, roughly 4x faster! ☆42 Updated last month
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆98 Updated 3 weeks ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆76 Updated last month
- ☆112 Updated 2 months ago
- ☆94 Updated 9 months ago
- ☆272 Updated 8 months ago