Sindhu-Hegde / gestsync
Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023
☆46Updated 10 months ago
Alternatives and similar repositories for gestsync
Users who are interested in gestsync are comparing it to the libraries listed below
- Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound☆128Updated 3 months ago
- TraDiffusion: Trajectory-Based Training-Free Image Generation☆51Updated 8 months ago
- The implementation of "An item is Worth a Prompt: Versatile Image Editing with Disentangled Control"☆74Updated 10 months ago
- ☆74Updated 2 months ago
- This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptati…☆122Updated 5 months ago
- [WACV 2025] Official implementation of "Face Anonymization Made Simple"☆179Updated 3 weeks ago
- Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language☆83Updated last year
- ☆51Updated 2 weeks ago
- A Gradio app for analyzing audio files to determine true sample rate and bit depth.☆18Updated 10 months ago
- The official implementation of "A Language Modeling Approach to Diacritic-Free Hebrew TTS"☆100Updated last month
- ☆22Updated 4 months ago
- ☆36Updated 9 months ago
- FG 2024 Papers: Explore a comprehensive collection of research papers presented at one of the premier conferences on automatic face and g…☆14Updated last year
- Video-LlaVA fine-tune for CinePile evaluation☆51Updated 11 months ago
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization☆50Updated 7 months ago
- Pytorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group☆133Updated 9 months ago
- Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024)☆220Updated this week
- KandinskyVideo — multilingual end-to-end text2video latent diffusion model☆184Updated last year
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models.☆48Updated 5 months ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation☆173Updated 2 months ago
- An official implementation of SwapAnyone.☆65Updated 4 months ago
- [ECCV 2024] Dyadic Interaction Modeling for Social Behavior Generation☆57Updated 2 months ago
- ☆37Updated 10 months ago
- Official repo for VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset.☆185Updated 3 months ago
- ☆61Updated last year
- ☆64Updated last year
- Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos☆22Updated 9 months ago
- Official PyTorch implementation of TokenSet.☆121Updated 3 months ago
- ☆50Updated 2 weeks ago
- Text Behind Video. Enjoy, it is completely free.☆33Updated 5 months ago