imatge-upc / wav2pix
Speech-conditioned face generation using Generative Adversarial Networks (ICASSP 2019)
☆56 · Feb 12, 2022 · Updated 4 years ago
Alternatives and similar repositories for wav2pix
Users that are interested in wav2pix are comparing it to the libraries listed below
- Speech-conditioned face generation using Generative Adversarial Networks ☆88 · Dec 8, 2022 · Updated 3 years ago
- ☆19 · Jul 14, 2019 · Updated 6 years ago
- CVPR 2019 ☆259 · May 24, 2023 · Updated 2 years ago
- Real-time melgan based on cpu !!! ☆13 · Dec 3, 2019 · Updated 6 years ago
- [NeurIPS 2019] Face Reconstruction from Voice using Generative Adversarial Networks ☆194 · Jan 5, 2020 · Updated 6 years ago
- Bachelor's thesis carried out at Universitat Politecnica de Catalunya in partial fulfilment of the requirements for the degree in Telecommun… ☆16 · Jul 25, 2024 · Updated last year
- A PyTorch implementation of MIT CSAIL's Speech2Face research paper from IEEE CVPR 2019 ☆14 · Mar 25, 2023 · Updated 2 years ago
- An implementation of http://openaccess.thecvf.com/content_CVPRW_2019/papers/Sight%20and%20Sound/Konstantinos_Vougioukas_End-to-End_Speech… ☆18 · Mar 19, 2020 · Updated 5 years ago
- ☆21 · Nov 1, 2018 · Updated 7 years ago
- Speech-Conditioned Face Generation with Deep Adversarial Networks ☆134 · Feb 17, 2020 · Updated 6 years ago
- Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016) ☆22 · Oct 14, 2017 · Updated 8 years ago
- Prosodic Speech Segmentation with Transformers ☆26 · Feb 25, 2024 · Updated last year
- Talking Head from Speech Audio using a Pre-trained Image Generator ☆23 · May 7, 2024 · Updated last year
- devops practise examples ☆11 · Aug 16, 2020 · Updated 5 years ago
- Code for Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019) ☆816 · May 11, 2021 · Updated 4 years ago
- Loss function of various types of GANs ☆26 · Oct 5, 2018 · Updated 7 years ago
- Grapheme-to-Phoneme conversion with Joint-Sequence RnnLMs ☆31 · Dec 15, 2014 · Updated 11 years ago
- ObamaNet : Photo-realistic lip-sync from audio (Unofficial port) ☆238 · Mar 28, 2018 · Updated 7 years ago
- Fatcord's Alternative WaveRNN (Faster training) ☆125 · Mar 29, 2019 · Updated 6 years ago
- Vocode spectrograms to audio with generative adversarial networks ☆63 · Aug 8, 2019 · Updated 6 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆41 · Jan 4, 2026 · Updated last month
- ☆208 · Mar 10, 2021 · Updated 4 years ago
- Tensorflow Implementation of WaveGlow ☆37 · May 4, 2020 · Updated 5 years ago
- Mel cepstral distortion (MCD) computations in python. ☆229 · Jun 13, 2017 · Updated 8 years ago
- Code for paper 'Audio-Driven Emotional Video Portraits'. ☆314 · Mar 16, 2022 · Updated 3 years ago
- You Said That?: Synthesising Talking Faces from Audio ☆70 · Apr 29, 2018 · Updated 7 years ago
- An implementation of ObamaNet: Photo-realistic lip-sync from text. ☆127 · Apr 21, 2019 · Updated 6 years ago
- A python implementation of the Griffin Lim Algorithm for audio reconstruction from magnitudes ☆34 · Jan 17, 2024 · Updated 2 years ago
- A WaveNet-based vocoder for fast inference ☆163 · Jun 10, 2018 · Updated 7 years ago
- PrincetonPy's one-day workshop : Introduction to Python for Scientific Computing ☆11 · Nov 4, 2015 · Updated 10 years ago
- Case studies with Bayesian methods ☆10 · Jul 4, 2023 · Updated 2 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆19 · Nov 3, 2025 · Updated 3 months ago
- ☆10 · Jul 21, 2019 · Updated 6 years ago
- This Repository Contains my Microwave Imaging Studies ☆11 · Mar 1, 2016 · Updated 9 years ago
- ATC-Anno is an annotation tool for Air Traffic Control data that offers automatic semantic and concept annotation. ☆12 · Nov 17, 2023 · Updated 2 years ago
- Code for Deep Cross modal learning for Caricature Verification and Identification (CaVINet), ACM MM, 2018 ☆33 · Apr 15, 2019 · Updated 6 years ago
- Predicting Political Instability and Social Conflicts Using Multimodal Data ☆10 · Jun 6, 2016 · Updated 9 years ago
- NeurIPS 2022 ☆39 · Nov 23, 2022 · Updated 3 years ago
- This python code performs an efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of a… ☆96 · May 30, 2020 · Updated 5 years ago