StelaBou / voxceleb_preprocessing
Download and preprocess the VoxCeleb datasets.
☆21 Updated 5 months ago
Related projects
Alternatives and complementary repositories for voxceleb_preprocessing
- MAVD is a Mandarin audio-visual dataset with depth information. MAVD has a rich variety of modal data, including audio, RGB ima… ☆16 Updated 7 months ago
- Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices ☆61 Updated 7 months ago
- PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP2023) ☆65 Updated 8 months ago
- Authors' official PyTorch implementation of "HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces" [IC… ☆79 Updated last year
- Talking Head from Speech Audio using a Pre-trained Image Generator ☆23 Updated 6 months ago
- Authors' official PyTorch implementation of "StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment" [FG 20… ☆114 Updated last year
- PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021) ☆22 Updated 8 months ago
- The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset" ☆94 Updated 6 months ago
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆55 Updated 4 months ago
- Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024). ☆23 Updated 7 months ago
- This is the official inference code of PD-FGC ☆82 Updated last year
- GANalyzer: Analysis and Manipulation of GANs Latent Space for Controllable Face Synthesis ☆37 Updated 9 months ago
- Tools for downloading the VoxCeleb2 dataset ☆26 Updated 8 months ago
- Official project repo for the paper "Speech Driven Video Editing via an Audio-Conditioned Diffusion Model" ☆227 Updated last year
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022) ☆51 Updated 9 months ago
- Efficient synchronization from sparse cues ☆28 Updated 6 months ago
- Code for the paper 'EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model' ☆186 Updated last year
- PolyGlotFake DataSet repository ☆14 Updated 6 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆77 Updated 5 months ago
- Project of "Adaptive Affine Transformation: A Simple and Effective Operation for Spatial Misaligned Image Generation" ☆57 Updated last year
- Parallel and High-Fidelity Text-to-Lip Generation (AAAI 2022); official code ☆106 Updated 2 years ago
- [ICCV2023] Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video ☆64 Updated 7 months ago
- Official Implementation of Visual Transformer Pooling for Lip reading ☆36 Updated 2 years ago
- An official implementation of "Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encod… ☆147 Updated last year
- 🎓 Update Talking-Face Research Papers Daily, Now Integrated with LLM Analysis. ☆141 Updated this week
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆25 Updated 3 weeks ago
- Disentangled Speech Embeddings using Cross-Modal Self-Supervision ☆154 Updated 4 years ago