Bai-YT / ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
☆33 Updated last month
Alternatives and similar repositories for ConsistencyTTA:
Users who are interested in ConsistencyTTA are comparing it to the libraries listed below:
- ☆37 Updated 7 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆41 Updated 3 months ago
- ☆43 Updated 7 months ago
- Code for the paper "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching" ☆49 Updated this week
- Robust Singing Voice Transcription and MIDI Extraction ☆66 Updated last month
- Unofficial download repository for MusicCaps ☆45 Updated last year
- PAM is a no-reference audio quality metric for audio generation tasks ☆54 Updated 5 months ago
- Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.… ☆52 Updated last year
- Official repo of ICASSP 2024 paper - Generative De-Quantization for Neural Speech Codec via Latent Diffusion. ☆48 Updated last week
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆51 Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆74 Updated 3 weeks ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆44 Updated 4 months ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆70 Updated 3 months ago
- Official implementation for FlowSep ☆24 Updated 2 weeks ago
- ☆54 Updated 2 months ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024). ☆23 Updated 3 months ago
- ☆65 Updated last month
- An official implementation of the ICASSP 2024 paper: Dual-Path TFC-TDF UNet for Music Source Separation ☆83 Updated 9 months ago
- Test code disclosure for the research paper "UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model", as a supplementa… ☆19 Updated last year
- ☆79 Updated last year
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆49 Updated 2 months ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆36 Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆91 Updated 5 months ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 Updated 11 months ago
- Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models. ☆79 Updated 4 months ago
- The open source code for SimpleSpeech series ☆121 Updated 3 months ago
- logWMSE, an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source… ☆28 Updated 5 months ago
- Official Implementation of EnCLAP (ICASSP 2024) ☆90 Updated 7 months ago
- ☆24 Updated last year
- Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation ☆26 Updated 10 months ago