vskadandale/vocalist

Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices

☆73

Alternatives and similar repositories for vocalist

Users who are interested in vocalist are comparing it to the repositories listed below.

xjchenGit / MTDVocaLiST
Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024).
☆29 • Apr 3, 2024 • Updated 2 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022)
☆56 • Jan 29, 2024 • Updated 2 years ago
DanielMengLiu / DeepLip
Deep-learning-based audio-visual lip biometrics
☆15 • May 9, 2023 • Updated 3 years ago
CV-IP / VFD
This is the release code for CVPR2022 paper "Voice-Face Homogeneity Tells Deepfake".
☆15 • Mar 7, 2022 • Updated 4 years ago
joonson / syncnet_python
Out of time: automated lip sync in the wild
☆895 • Apr 17, 2026 • Updated 3 months ago
Sxjdwang / TalkLip
☆429 • Nov 1, 2023 • Updated 2 years ago
guanjz20 / StyleSync_PyTorch
PyTorch implementation of "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator"
☆215 • Aug 8, 2023 • Updated 2 years ago
OpenTalker / StyleHEAT
[ECCV 2022] StyleHEAT: A framework for high-resolution editable talking face generation
☆656 • Mar 26, 2023 • Updated 3 years ago
georgesterpu / avsr-tf1
Audio-Visual Speech Recognition using Sequence to Sequence Models
☆84 • Jul 10, 2020 • Updated 6 years ago
joonson / syncnet_trainer
Disentangled Speech Embeddings using Cross-Modal Self-Supervision
☆167 • Apr 12, 2020 • Updated 6 years ago
JuanFMontesinos / Acappella-YNet
Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21
☆18 • May 14, 2022 • Updated 4 years ago
facebookresearch / av_hubert
A self-supervised learning framework for audio-visual speech
☆996 • Dec 7, 2023 • Updated 2 years ago
julianyulu / SyncNetCN
Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released
☆11 • Nov 8, 2021 • Updated 4 years ago
okankop / ASDNet
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
☆73 • Jan 18, 2022 • Updated 4 years ago
FuxiVirtualHuman / styletalk
☆527 • Dec 26, 2023 • Updated 2 years ago
ms-dot-k / LRW_ID
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10 • Oct 12, 2023 • Updated 2 years ago
SpringHuo / MAVD
The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…
☆20 • Apr 22, 2024 • Updated 2 years ago
AaronComo / LipFD
[NeurIPS 2024] This is the official repo of the paper "Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Li…
☆139 • Feb 9, 2025 • Updated last year
seungheondoh / speech-to-music
Textless Speech-to-Music Retrieval Using Emotion Similarity [ICASSP23]
☆17 • Aug 16, 2023 • Updated 2 years ago
Dianezzy / ParaLip
Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022; Official code
☆109 • May 1, 2022 • Updated 4 years ago
evonneng / learning2listen
Official pytorch implementation for Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion (CVPR 2022)
☆129 • Aug 18, 2024 • Updated last year
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆438 • May 18, 2023 • Updated 3 years ago
prajwalkr / transpotter
Official implementation of Transpotter, published in BMVC 2021
☆16 • Aug 6, 2022 • Updated 3 years ago
Sindhu-Hegde / pseudo-visual-speech-denoising
Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021
☆108 • May 27, 2024 • Updated 2 years ago
filby89 / spectre
Official Pytorch Implementation of SPECTRE: Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos
☆302 • Mar 24, 2025 • Updated last year
ahaliassos / LipForensics
Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection (CVPR 2021)
☆143 • Feb 1, 2024 • Updated 2 years ago
zhangchenxu528 / FACIAL
FACIAL: Synthesizing Dynamic Talking Face With Implicit Attribute Learning. ICCV, 2021.
☆380 • Jun 30, 2022 • Updated 4 years ago
psyai-net / SelfTalk_release
This is the official source for our ACM MM 2023 paper "SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking …
☆143 • Dec 5, 2023 • Updated 2 years ago
yl4579 / SLMGAN
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
☆16 • Jul 19, 2023 • Updated 3 years ago
NetEase-GameAI / Face2FaceRHO
The Official PyTorch Implementation for Face2Face^ρ (ECCV2022)
☆226 • May 6, 2023 • Updated 3 years ago
zheshiyige / Learning-Diverse-Stochastic-Human-Action-Generators-by-Learning-Smooth-Latent-Transitions
Code for "Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions"
☆21 • Dec 24, 2019 • Updated 6 years ago
liutaocode / LivePortrait-Train
Unofficial LivePortrait Training Script [🚧 Under Construction]
☆40 • Jan 28, 2025 • Updated last year
galib360 / FaceXHuBERT
☆100 • Oct 30, 2025 • Updated 8 months ago
MRzzm / HDTF
The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"
☆429 • May 12, 2024 • Updated 2 years ago
MRzzm / DINet
The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."
☆1,128 • Sep 25, 2023 • Updated 2 years ago
itsyoavshalev / End-to-End-Lip-Synchronization-with-a-Temporal-AutoEncoder
☆22 • Mar 31, 2022 • Updated 4 years ago
Hangz-nju-cuhk / Talking-Face_PC-AVS
Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)
☆959 • Jan 6, 2024 • Updated 2 years ago
msaadsaeed / SBNet
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆13 • Aug 28, 2023 • Updated 2 years ago
Sindhu-Hegde / multivsr
Official code for the paper "Scaling Multilingual Visual Speech Recognition"
☆20 • Aug 15, 2025 • Updated 11 months ago