A pipeline to read lips and generate speech for the read content, i.e Lip to Speech Synthesis.
☆94Jul 23, 2025Updated 8 months ago
Alternatives and similar repositories for Lip2Speech
Users that are interested in Lip2Speech are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- a PyTorch implementation of Lip2Wav☆50Oct 2, 2022Updated 3 years ago
- PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021)☆25Mar 9, 2024Updated 2 years ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)☆26Feb 22, 2024Updated 2 years ago
- Visual Speech Recognition for Multiple Languages☆460Aug 17, 2023Updated 2 years ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units☆47Oct 26, 2024Updated last year
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP2023)☆70Mar 9, 2024Updated 2 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- End-to-end Text-to-Speech with Generative Adversarial Networks☆20Feb 6, 2021Updated 5 years ago
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation☆46Sep 6, 2024Updated last year
- ☆20Jul 13, 2022Updated 3 years ago
- ☆67Aug 16, 2023Updated 2 years ago
- A speech recognition system using 3D CNNs. The final model achieves 97.4% training accuracy and a 99.2% testing accuracy and the system c…☆67Apr 13, 2023Updated 2 years ago
- This is the repository containing codes for our CVPR, 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech S…☆713Jul 6, 2023Updated 2 years ago
- Huggingface Implementation of AV-HuBERT on the MuAViC Dataset☆19Mar 6, 2025Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis☆151Feb 11, 2023Updated 3 years ago
- PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis☆69Aug 3, 2021Updated 4 years ago
- Bilingual Singing Voice Synthesis☆18Mar 25, 2024Updated 2 years ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"☆98Nov 14, 2024Updated last year
- Official implementation of Meta-StyleSpeech and StyleSpeech☆252Feb 9, 2022Updated 4 years ago
- Speech Parameter Estimation Using Differentiable Speech Synthesizer☆44May 9, 2023Updated 2 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- Automated Lip reading from real-time videos in tensorflow in python☆164Mar 20, 2018Updated 8 years ago
- ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…☆433May 18, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech☆11Jun 30, 2023Updated 2 years ago
- Torch implementation of Whisper-guided DDPM based Voice Conversion☆49Mar 7, 2023Updated 3 years ago
- Official Code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"☆86Sep 19, 2024Updated last year
- Repository for the paper: VoiceMe: Personalized voice generation in TTS☆126Apr 29, 2022Updated 3 years ago
- Deep Visual Speech Recognition in arabic words☆16Oct 18, 2023Updated 2 years ago
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme…☆23Aug 14, 2025Updated 7 months ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…☆77Jul 16, 2023Updated 2 years ago
- ICASSP 2023 Accepted☆190May 6, 2024Updated last year
- [Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization☆60Mar 26, 2025Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆22May 26, 2025Updated 10 months ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022)☆119Feb 7, 2024Updated 2 years ago
- [SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder☆45Apr 18, 2023Updated 2 years ago
- SelfRemaster: SSL Speech Restoration☆94Jan 5, 2024Updated 2 years ago
- ☆65Oct 8, 2018Updated 7 years ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆41Jan 4, 2026Updated 2 months ago
- ☆171Jul 25, 2022Updated 3 years ago