IS2AI / SpeakingFaces
A large-scale publicly-available visual-thermal-audio dataset designed to encourage research in the general areas of user authentication, facial recognition, speech recognition, and human-computer interaction.
☆78 Updated 3 years ago
Alternatives and similar repositories for SpeakingFaces:
Users who are interested in SpeakingFaces are comparing it to the repositories listed below:
- Unsupervised Any-to-many Audiovisual Synthesis via Exemplar Autoencoders ☆120 Updated 2 years ago
- Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021 ☆104 Updated 8 months ago
- You Said That?: Synthesising Talking Faces from Audio ☆69 Updated 6 years ago
- Speech-conditioned face generation using Generative Adversarial Networks (ICASSP 2019) ☆56 Updated 3 years ago
- Facial Expression Feature Extractor ☆68 Updated 2 years ago
- A tool for facial action unit analysis ☆35 Updated last year
- Learning Lip Sync of Obama from Speech Audio ☆67 Updated 4 years ago
- This is the official implementation for IVA'20 Best Paper Award paper "Let's Face It: Probabilistic Multi-modal Interlocutor-aware Gener… ☆16 Updated 2 years ago
- This github contains the network architectures of NeuralVoicePuppetry. ☆79 Updated 4 years ago
- Python implementation of the paper "Dynamic Temporal Alignment of Speech to Lips" ☆31 Updated 5 years ago
- Implementation for Pre-training strategies and datasets for facial representation learning, ECCV 2022 ☆70 Updated last year
- Disentangled Speech Embeddings using Cross-Modal Self-Supervision ☆156 Updated 4 years ago
- Official pytorch implementation for Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion (CVPR 2022) ☆109 Updated 6 months ago
- The official implementation for ICMI 2020 Best Paper Award "Gesticulator: A framework for semantically-aware speech-driven gesture gener… ☆125 Updated 2 years ago
- Speech-conditioned face generation using Generative Adversarial Networks ☆88 Updated 2 years ago
- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022) ☆65 Updated last year
- PATS Dataset. Aligned Pose-Audio-Transcripts and Style for co-speech gesture research ☆57 Updated last year
- Implementation of NWT, audio-to-video generation, in Pytorch ☆88 Updated 2 years ago
- CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices? ☆128 Updated 2 months ago
- This repository contains the code for my master thesis on Emotion-Aware Facial Animation ☆147 Updated 2 years ago
- This repository contains scripts to build Youtube Gesture Dataset. ☆121 Updated last year
- Tools for downloading VoxCeleb2 dataset ☆28 Updated 11 months ago
- mirror of VoxCeleb dataset - a large-scale speaker identification dataset ☆69 Updated 5 years ago
- ☆10 Updated 3 months ago
- Talking Face Generation by Conditional Recurrent Adversarial Network ☆61 Updated 5 years ago
- Aligns faces to the canonical face in both videos and images ☆17 Updated 2 years ago
- ☆35 Updated 6 years ago
- PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP 2023) ☆65 Updated 11 months ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022) ☆52 Updated last year
- Function to frontalize non-frontal 2D facial landmarks generated from the DLIB library ☆23 Updated 3 years ago