Sindhu-Hegde / gestsync
Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023
☆35 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for gestsync
- Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound ☆116 · Updated last week
- The implementation of "An item is Worth a Prompt: Versatile Image Editing with Disentangled Control" ☆64 · Updated 2 months ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆57 · Updated last month
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ☆43 · Updated 2 weeks ago
- TraDiffusion: Trajectory-Based Training-Free Image Generation ☆50 · Updated last week
- Vico: Compositional Video Generation as Flow Equalization ☆52 · Updated this week
- Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner" ☆60 · Updated last month
- Interactive Video Generation via Masked-Diffusion ☆70 · Updated 7 months ago
- ☆78 · Updated 3 months ago
- Official Implementation of weights2weights ☆121 · Updated last month
- PyTorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group ☆127 · Updated last month
- ☆27 · Updated last month
- Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language ☆61 · Updated 5 months ago
- Code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection ☆21 · Updated last year
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis. ☆40 · Updated 2 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆77 · Updated 5 months ago
- Video-LLaVA fine-tune for CinePile evaluation ☆38 · Updated 3 months ago
- Code for the SIGGRAPH Asia 2024 paper "Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation" ☆33 · Updated 2 months ago
- Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos ☆15 · Updated last month
- Experiencing lightning fast (~1s) and accurate drag-based image editing ☆55 · Updated 3 weeks ago
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆30 · Updated last month
- ☆2 · Updated last month
- Paint by Inpaint: Learning to Add Image Objects by Removing Them First ☆87 · Updated 2 months ago
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners ☆128 · Updated 4 months ago
- FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality ☆164 · Updated last week
- Training-and-prompt Free General Painterly Image Harmonization Using image-wise attention sharing ☆52 · Updated 5 months ago
- ☆60 · Updated last year
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M… ☆17 · Updated last month
- Official implementation for "pOps: Photo-Inspired Diffusion Operators" ☆70 · Updated 3 months ago