dsforza96 / visual-mic
Passive Recovery of Sound from Video
☆41 Updated 4 years ago
Alternatives and similar repositories for visual-mic:
Users who are interested in visual-mic are comparing it to the libraries listed below:
- When sound hits an object, it causes small vibrations on the object’s surface. Here we show how, using only high-speed video of the objec… ☆11 Updated 5 months ago
- This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) wit… ☆168 Updated 4 years ago
- Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face Behind a Voice by MIT CSAIL ☆173 Updated last year
- Python implementation of EVM (Eulerian Video Magnification) ☆32 Updated 2 years ago
- Binary classification problem that aims to classify human voices from audio recordings. Implemented using PyTorch and Librosa. ☆33 Updated 3 years ago
- Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style tr… ☆897 Updated last year
- Generating Sentiment-aware Visual Stories using Cross-modal Music Translation ☆71 Updated 5 years ago
- Identify the emotion of multiple speakers in an audio segment ☆166 Updated 2 years ago
- Python implementation of EVM (Eulerian Video Magnification) ☆233 Updated 2 years ago
- Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021) ☆356 Updated 7 months ago
- This repository provides a Python implementation of Eulerian motion magnification. ☆25 Updated 3 years ago
- Audio-Visual Speech Separation with Cross-Modal Consistency ☆227 Updated last year
- A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech. ☆243 Updated 2 years ago
- A reproduction of Eulerian Video Magnification for Revealing Subtle Changes in the World ☆10 Updated 3 years ago
- Client-side air drawing tool ☆184 Updated 2 years ago
- Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM ☆358 Updated last year
- Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code ☆431 Updated last year
- A curated list of different papers and datasets in various areas of audio-visual processing ☆691 Updated last year
- Face synthetics datasets ☆842 Updated 2 months ago
- Long-Inference, High Quality Synthetic Speaker (AI avatar/AI presenter) ☆250 Updated last year
- PyTorch implementation of the WaveGAN model to generate audio ☆163 Updated 4 years ago
- Automated lip reading from real-time videos, in TensorFlow and Python ☆160 Updated 6 years ago
- Clone a voice in 5 seconds to generate arbitrary speech in real time ☆16 Updated last year
- AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss ☆1,035 Updated 4 months ago
- Code for IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds ☆117 Updated 2 months ago
- Deep Learning - one shot learning for speaker recognition using Filter Banks ☆164 Updated 7 months ago
- This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf ☆377 Updated 2 years ago
- Docker files for Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression ☆381 Updated last year
- Python package for openSMILE ☆267 Updated 2 months ago
- The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022 ☆193 Updated 2 years ago