☆27Jun 27, 2023Updated 2 years ago
Alternatives and similar repositories for cncvs_data_collector
Users who are interested in cncvs_data_collector are comparing it to the libraries listed below.
- The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…☆20Apr 22, 2024Updated 2 years ago
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆23Apr 27, 2024Updated 2 years ago
- ☆19Jun 29, 2025Updated 11 months ago
- wav2lip in a Vector Quantized (VQ) space☆27Jun 20, 2023Updated 2 years ago
- An unofficial (PyTorch) implementation for the paper Deep Lip Reading: A comparison of models and an online application.☆10May 13, 2020Updated 6 years ago
- Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)☆24Mar 17, 2025Updated last year
- FaceFormer Emo: Speech-Driven 3D Facial Animation with Emotion Embedding☆27Jul 15, 2023Updated 2 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Visual Speech Recognition☆20Dec 24, 2024Updated last year
- ☆15Oct 10, 2023Updated 2 years ago
- livekit agent plugins☆47Apr 21, 2026Updated last month
- Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer☆37Apr 17, 2026Updated last month
- TaskNya is a general-purpose task monitoring and notification tool, suitable for monitoring and sending notifications for general tasks.☆12May 12, 2026Updated last month
- optimized wav2lip☆18Jan 6, 2024Updated 2 years ago
- Speech-Driven Expression Blendshape Based on Single-Layer Self-attention Network (AIWIN 2022)☆78Oct 21, 2022Updated 3 years ago
- [BMVC'24] G3FA: Geometry-guided GAN for Face Animation☆20Mar 14, 2025Updated last year
- ☆10Feb 17, 2023Updated 3 years ago
- ☆21Dec 9, 2023Updated 2 years ago
- Voice conversion with just linear regression.☆37Sep 25, 2025Updated 8 months ago
- [ICCV 2025] Official repo of "EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow"☆27Oct 16, 2025Updated 7 months ago
- ☆56Dec 20, 2023Updated 2 years ago
- ☆103Nov 26, 2025Updated 6 months ago
- ☆10Nov 19, 2023Updated 2 years ago
- ☆29Oct 1, 2023Updated 2 years ago
- [IJCAI2022] Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast☆21Oct 25, 2023Updated 2 years ago
- ☆21Mar 4, 2024Updated 2 years ago
- Tsinghua University SPMI Lab array processing toolkit☆18Nov 23, 2016Updated 9 years ago
- SyncNet for Time Synchronization☆30Mar 13, 2023Updated 3 years ago
- Prompting Large Language Models with Audio for General-Purpose Speech Summarization☆20May 14, 2025Updated last year
- ☆15Oct 28, 2019Updated 6 years ago
- The official code for paper: GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expressions…☆24May 28, 2026Updated 2 weeks ago
- CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?☆130Dec 11, 2024Updated last year
- An implementation of http://openaccess.thecvf.com/content_CVPRW_2019/papers/Sight%20and%20Sound/Konstantinos_Vougioukas_End-to-End_Speech…☆18Mar 19, 2020Updated 6 years ago
- ☆45Jan 10, 2026Updated 5 months ago
- Auto-AVSR: Lip-Reading Sentences Project☆423Jan 8, 2025Updated last year
- Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models☆23Apr 15, 2024Updated 2 years ago
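- Ultra-fast wav2lip inference that uses wav2lip, gfpgan, yolov5, and other models with RT acceleration. In testing it reaches 0.03 seconds per frame on a 2070 GPU, enabling real-time inference.☆31Sep 23, 2025Updated 8 months ago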
- ☆24Oct 8, 2021Updated 4 years ago
- MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]☆302Jul 7, 2024Updated last year