Sindhu-Hegde / gestsync
Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023
☆45Updated last year
Alternatives and similar repositories for gestsync
Users who are interested in gestsync are comparing it to the libraries listed below.
- Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound☆130Updated 5 months ago
- TraDiffusion: Trajectory-Based Training-Free Image Generation☆51Updated 9 months ago
- The implementation of "An item is Worth a Prompt: Versatile Image Editing with Disentangled Control"☆74Updated last year
- This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptati…☆125Updated 6 months ago
- repo for active speaker detection for media videos.☆29Updated last year
- Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language☆83Updated last year
- BLIP Live Image Captioning with Real-Time Video Stream This repository provides a Python-based implementation for real-time image captio…☆39Updated 7 months ago
- ☆77Updated 3 months ago
- [WACV 2025] Official implementation of "Face Anonymization Made Simple"☆182Updated 2 months ago
- Combine digital painting with AI image generation.☆143Updated 2 months ago
- ☆36Updated 11 months ago
- ☆55Updated last week
- Enhance faces in AI generated images☆46Updated 2 months ago
- Text Behind Video. Enjoy, it is completely free.☆33Updated 6 months ago
- Video-LlaVA fine-tune for CinePile evaluation☆51Updated last year
- Swap your face in real-time☆76Updated 5 months ago
- A Gradio app for analyzing audio files to determine true sample rate and bit depth.☆18Updated 11 months ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation☆177Updated last month
- This lightweight Chrome extension lets you pin and manage your most important Deepseek chats for improved productivity, keeping them easi…☆15Updated 5 months ago
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis.☆46Updated 11 months ago
- Graph learning framework for long-term video understanding☆66Updated last month
- MBASE, an LLM SDK in C++☆53Updated last month
- A simple IPA phonemizer based on ruaccent-encoder☆24Updated 4 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models.☆49Updated 6 months ago
- Converting Google Maps Screenshot to 3D Model☆21Updated 2 months ago
- FG 2024 Papers: Explore a comprehensive collection of research papers presented at one of the premier conferences on automatic face and g…☆14Updated last year
- ☆61Updated last year
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization☆50Updated 8 months ago
- KandinskyVideo — multilingual end-to-end text2video latent diffusion model☆184Updated last year
- ☆84Updated 11 months ago