StelaBou / voxceleb_preprocessing
Download and preprocess the VoxCeleb datasets.
☆20 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for voxceleb_preprocessing
- Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices ☆61 · Updated 7 months ago
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆55 · Updated 3 months ago
- Efficient synchronization from sparse cues ☆28 · Updated 6 months ago
- Authors' official PyTorch implementation of "HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces" [IC… ☆79 · Updated last year
- PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS 2021) ☆21 · Updated 8 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆76 · Updated 4 months ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022) ☆50 · Updated 9 months ago
- MAVD is a Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima… ☆16 · Updated 6 months ago
- PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP 2023) ☆62 · Updated 8 months ago
- GANalyzer: Analysis and Manipulation of GANs Latent Space for Controllable Face Synthesis ☆37 · Updated 8 months ago
- Talking Head from Speech Audio using a Pre-trained Image Generator ☆23 · Updated 6 months ago
- Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024) ☆23 · Updated 7 months ago
- The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset" ☆93 · Updated 5 months ago
- Authors' official PyTorch implementation of "StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment" [FG 20… ☆114 · Updated last year
- Tools for downloading the VoxCeleb2 dataset ☆26 · Updated 7 months ago
- SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization (Interspeech 2024) ☆21 · Updated 2 weeks ago
- MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV 2020] ☆244 · Updated 4 months ago
- Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022; Official code ☆106 · Updated 2 years ago
- Official Implementation of Visual Transformer Pooling for Lip reading ☆36 · Updated 2 years ago
- [WACV 2024] FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features ☆16 · Updated 4 months ago
- PyTorch implementation of slicing adversarial network (SAN) ☆90 · Updated 4 months ago
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies" ☆75 · Updated 11 months ago
- Disentangled Speech Embeddings using Cross-Modal Self-Supervision ☆154 · Updated 4 years ago
- Pre-trained model weights of MAE-Face ☆25 · Updated 9 months ago
- PolyGlotFake DataSet repository ☆14 · Updated 5 months ago
- An official implementation of "Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encod… ☆147 · Updated last year