Magicboomliu / Viseme-Classification
A pipeline covering dataset gathering, data annotation, model training, and model evaluation for viseme (visual sound phoneme) classification
☆14Updated 4 years ago
Alternatives and similar repositories for Viseme-Classification:
Users who are interested in Viseme-Classification are comparing it to the repositories listed below
- Code that generates phonemes from audio features.☆28Updated 3 years ago
- 3D Avatar Lip Synchronization from speech (JALI based face-rigging)☆78Updated 2 years ago
- PyTorch reimplementation of audio-driven face mesh or blendshape models, including Audio2Mesh, VOCA, etc.☆14Updated 6 months ago
- ☆44Updated last year
- CPU inference version of VisemeNet-tensorflow☆14Updated 5 years ago
- ☆94Updated 3 years ago
- ☆100Updated last year
- Code for "SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces" ACM MM 2023☆30Updated last year
- Blender add-on to implement VOCA neural network.☆59Updated 2 years ago
- PoseAI LiveLink Compatible on macOS☆8Updated 2 years ago
- PyTorch implementation of NEUTART, a system that creates photorealistic talking avatars from an input text transcription.☆33Updated 2 weeks ago
- A novel approach for personalized speech-driven 3D facial animation☆47Updated 11 months ago
- Music to Dance for 3D Avatar☆15Updated 3 years ago
- Speech to Facial Animation using GANs☆41Updated 3 years ago
- SyncTalkFace: Talking Face Generation for Precise Lip-syncing via Audio-Lip Memory☆33Updated 2 years ago
- SAiD: Blendshape-based Audio-Driven Speech Animation with Diffusion☆101Updated last year
- Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness (ICASSP 202…☆64Updated last year
- Drive your metahuman to speak within 1 second.☆4Updated last week
- ☆11Updated 2 years ago
- ☆15Updated 6 months ago
- FaceFormer Emo: Speech-Driven 3D Facial Animation with Emotion Embedding☆25Updated last year
- Speech-Driven Expression Blendshape Based on Single-Layer Self-attention Network (AIWIN 2022)☆76Updated 2 years ago
- Crystal TTVS engine is a real-time audio-visual multilingual speech synthesizer with a 3D expressive avatar.☆84Updated 4 years ago
- ☆32Updated last year
- Splitting the ASR probability distribution results into Chinese pinyin, so as to extract more effective features for Chinese speech…☆21Updated 2 years ago
- ☆13Updated last year
- Audio-Visual Lip Synthesis via Intermediate Landmark Representation☆16Updated last year
- PersonaTalk Hack☆13Updated 2 months ago
- Chinese text to facial expressions☆29Updated 2 years ago
- Daily tracking of awesome avatar papers, including 2D talking heads, 3D head avatars, and body avatars.☆63Updated last week