xinshengwang / S2IGAN
Pytorch Code for S2IGAN
☆41 Updated 5 years ago
Alternatives and similar repositories for S2IGAN
Users that are interested in S2IGAN are comparing it to the libraries listed below
- Implementation of Differential Learning Rate in Keras ☆11 Updated 6 years ago
- Two-stage GANs that generate fingerstyle guitarist images from audio. ☆59 Updated 7 years ago
- Speech-conditioned face generation using Generative Adversarial Networks ☆88 Updated 3 years ago
- Code base for WaveTransformer: A novel architecture for automated audio captioning ☆44 Updated 4 years ago
- This sample includes a simple CNN classifier for music and an audio-folder dataloader, just like ImageFolder in torchvision. ☆11 Updated 7 years ago
- Pytorch code for the paper 'Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acousti… ☆14 Updated 5 years ago
- Speech-conditioned face generation using Generative Adversarial Networks (ICASSP 2019) ☆56 Updated 3 years ago
- Implementation of Multistream Transformers in Pytorch ☆54 Updated 4 years ago
- Implementation of NWT, audio-to-video generation, in Pytorch ☆92 Updated 3 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 Updated 4 years ago
- Feature extractor for DL speech processing. ☆66 Updated 3 years ago
- Unsupervised Any-to-many Audiovisual Synthesis via Exemplar Autoencoders ☆122 Updated 3 years ago
- Comprehensive Python library for speech and voice. ☆32 Updated 3 years ago
- ☆25 Updated 6 years ago
- ☆25 Updated 7 years ago
- bumble bee transformer ☆14 Updated 4 years ago
- ☆27 Updated 6 years ago
- Source code for "Towards a Deeper Understanding of Adversarial Losses under a Discriminative Adversarial Network Setting" ☆42 Updated 3 years ago
- Anonymous ICLR Submission ☆14 Updated 6 years ago
- A Pytorch Implementation of MelNet ☆26 Updated 5 years ago
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆59 Updated 4 years ago
- Pytorch implementation of sparse_image_warp and an example of GoogleBrain's SpecAugment is given: A Simple Data Augmentation Method for A… ☆24 Updated 6 years ago
- mirror of VoxCeleb dataset - a large-scale speaker identification dataset ☆73 Updated 6 years ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆118 Updated 3 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning" ☆15 Updated 2 years ago
- Python code for handling the Clotho dataset. ☆85 Updated 5 years ago
- Code for paper "direct speech-to-image translation" ☆27 Updated 5 years ago
- ☆10 Updated last year
- ☆16 Updated 4 years ago
- Implementation of "FastSpeech: Fast, Robust and Controllable Text to Speech" ☆64 Updated 2 years ago