plnguyen2908 / LASER_ASD
[WACV 2026] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implementation
☆20Updated last month
Alternatives and similar repositories for LASER_ASD
Users that are interested in LASER_ASD are comparing it to the libraries listed below
- ☆62Updated 6 months ago
- Code for "SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces" ACM MM 2023☆30Updated 2 years ago
- ☆20Updated 3 years ago
- The project page repo for Neural Dubber.☆30Updated 2 years ago
- FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.☆53Updated last month
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization☆53Updated last year
- We present a model that can generate accurate 3D sound fields of human bodies from headset microphones and body pose as inputs.☆89Updated last year
- Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness (ICASSP 202…☆72Updated last year
- ☆37Updated last year
- Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos☆25Updated last year
- Official Repo for MoCha: Towards Movie-Grade Talking Character Synthesis☆60Updated 2 weeks ago
- This is an official PyTorch implementation of "Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gestu…☆26Updated last year
- Code for Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction (ACL24)☆48Updated last year
- Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.☆32Updated 7 months ago
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis☆40Updated 2 years ago
- Implementation for the paper "Can Language Models Learn to Listen?"☆69Updated 2 years ago
- video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is d…☆136Updated 2 weeks ago
- ☆20Updated last year
- Foundation Models and Data for Human-Human and Human-AI interactions.☆329Updated 3 weeks ago
- The ReprGesture entry to the GENEA Challenge 2022 (ICMI 2022)☆16Updated 3 years ago
- Official code for the paper "Understanding Co-speech Gestures in-the-wild"☆20Updated 2 months ago
- repo for active speaker detection for media videos.☆30Updated 2 years ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated last year
- ☆55Updated 6 months ago
- ☆22Updated 2 months ago
- Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language☆85Updated last year
- [AAAI 2025] The official repository of UniMuMo☆126Updated 3 months ago
- [ECCV 2024] Dyadic Interaction Modeling for Social Behavior Generation☆62Updated 8 months ago
- Daily tracking of awesome avatar papers, including 2d talking head, 3d head avatar, body avatar.☆77Updated 3 months ago
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆188Updated last year