haoheliu / AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
☆228Updated last month
Alternatives and similar repositories for AudioLDM-training-finetuning:
Users interested in AudioLDM-training-finetuning are comparing it to the repositories listed below:
- MU-LLaMA: Music Understanding Large Language Model☆251Updated 9 months ago
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati…☆168Updated 9 months ago
- The latent diffusion model for text-to-music generation.☆164Updated 11 months ago
- ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model☆193Updated 8 months ago
- This repository contains metadata for the WavCaps dataset and code for downstream tasks.☆215Updated 5 months ago
- A lightweight library for Frechet Audio Distance calculation.☆243Updated 4 months ago
- A simple library for Fréchet Audio Distance (FAD) calculation☆170Updated last week
- LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]☆294Updated 9 months ago
- Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.☆182Updated last month
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models☆171Updated 7 months ago
- Unofficial download repository for MusicCaps☆45Updated last year
- ☆149Updated 2 weeks ago
- Refactored / updated version of `stable-audio-tools` which is an open-source code for audio/music generative models originally by Stabili…☆155Updated 5 months ago
- This toolbox aims to unify audio generation model evaluation for easier comparison.☆311Updated 3 months ago
- VoiceLDM: Text-to-Speech with Environmental Context☆166Updated 5 months ago
- The Open Source Code of UniAudio☆537Updated 5 months ago
- Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech☆234Updated 10 months ago
- Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V…☆206Updated 5 months ago
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer☆122Updated 3 weeks ago
- Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".☆332Updated 8 months ago
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"☆324Updated 4 months ago
- [ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation☆70Updated 9 months ago
- Encode and decode audio samples to/from compressed latent representations!☆168Updated 5 months ago
- PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.☆220Updated 3 months ago
- Unofficial implementation of "High Fidelity Neural Audio Compression"☆140Updated 5 months ago
- CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone☆135Updated 9 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3☆183Updated 8 months ago
- ☆153Updated last year
- Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher☆178Updated last year
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆91Updated 2 months ago