v-iashin / SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)

☆369

Alternatives and similar repositories for SpecVQGAN

Users who are interested in SpecVQGAN are comparing it to the repositories listed below.

descriptinc / lyrebird-wav2clip
Official implementation of the paper "Wav2CLIP: Learning Robust Audio Representations from CLIP"
☆355 Updated 3 years ago
RoySheffer / im2wav
Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation
☆124 Updated 2 years ago
PeihaoChen / regnet
Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned S…
☆54 Updated 4 years ago
yangdongchao / Text-to-sound-Synthesis
The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"
☆363 Updated 2 years ago
haoheliu / audioldm_eval
This toolbox aims to unify audio generation model evaluation for easier comparison.
☆365 Updated last year
Kinyugo / msanii
A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.
☆195 Updated 2 years ago
RetroCirce / MusicLDM
The latent diffusion model for text-to-music generation.
☆181 Updated last year
haoheliu / AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
☆284 Updated 11 months ago
archinetai / audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
☆130 Updated 2 years ago
gladia-research-group / multi-source-diffusion-models
☆172 Updated 2 years ago
seungheondoh / music-text-representation
Toward Universal Text-to-Music-Retrieval (TTMR) [ICASSP23]
☆114 Updated 2 years ago
happylittlecat2333 / Auffusion
Official code and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati…
☆190 Updated last year
XYPB / CondFoleyGen
Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".
☆91 Updated last year
archinetai / a-unet
A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch.
☆89 Updated 2 years ago
YatingMusic / ddsp-singing-vocoders
Official implementation of SawSing (ISMIR'22)
☆269 Updated 3 years ago
XinhaoMei / WavCaps
This repository contains metadata of the WavCaps dataset and code for downstream tasks.
☆251 Updated last year
gudgud96 / frechet-audio-distance
A lightweight library for Frechet Audio Distance calculation.
☆302 Updated this week
L-YeZhu / D2M-GAN
[ECCV2022] D2M-GAN for music generation from dance videos
☆85 Updated 3 years ago
RetroCirce / Zero_Shot_Audio_Source_Separation
The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022
☆210 Updated 3 years ago
archinetai / audio-encoders-pytorch
A collection of audio autoencoders, in PyTorch.
☆44 Updated 2 years ago
tencent-ailab / bddm
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
☆230 Updated 3 years ago
wzk1015 / video-bgm-generation
[ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer
☆321 Updated 6 months ago
cdjkim / audiocaps
🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps
☆199 Updated 2 months ago
v-iashin / Synchformer
Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)
☆95 Updated 2 months ago
MoonInTheRiver / NeuralSVB
Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code
☆453 Updated last year
SonyCSLParis / music2latent
Encode and decode audio samples to/from compressed latent representations!
☆240 Updated 2 months ago
guyyariv / AudioToken
This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image …
☆87 Updated last year
shansongliu / MU-LLaMA
MU-LLaMA: Music Understanding Large Language Model
☆296 Updated 3 months ago
seungheondoh / music_caps_dl
Unofficial download repository for MusicCaps
☆48 Updated 2 years ago
zhuole1025 / SymMV
[ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation
☆77 Updated last year