Audio-Visual Speech Separation with Cross-Modal Consistency
☆246 · Jul 25, 2023 · Updated 2 years ago
Alternatives and similar repositories for VisualVoice
Users who are interested in VisualVoice are comparing it to the repositories listed below.
- Deep-Learning-Based Audio-Visual Speech Enhancement and Separation ☆219 · Apr 16, 2023 · Updated 2 years ago
- ☆42 · Nov 22, 2024 · Updated last year
- Pytorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training. ☆18 · Jul 11, 2022 · Updated 3 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 · Aug 2, 2021 · Updated 4 years ago
- An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits ☆83 · Apr 28, 2024 · Updated last year
- A must-read paper for speech separation based on neural networks ☆911 · Aug 11, 2025 · Updated 6 months ago
- Executable code based on Google articles ☆166 · Dec 8, 2022 · Updated 3 years ago
- Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments ☆111 · Mar 19, 2024 · Updated last year
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆115 · Nov 16, 2020 · Updated 5 years ago
- ☆62 · Jun 28, 2023 · Updated 2 years ago
- Official code release for "RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation", accepted ICLR 2024 ☆49 · Oct 14, 2025 · Updated 4 months ago
- A self-supervised learning framework for audio-visual speech ☆970 · Dec 7, 2023 · Updated 2 years ago
- StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation ☆253 · Sep 13, 2024 · Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆48 · Sep 4, 2023 · Updated 2 years ago
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild. ☆59 · Jan 24, 2024 · Updated 2 years ago
- ☆18 · Nov 22, 2024 · Updated last year
- TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings ☆38 · Oct 27, 2025 · Updated 4 months ago
- VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer ☆35 · Mar 18, 2023 · Updated 2 years ago
- This repo summarizes the tutorials, datasets, papers, codes and tools for speech separation and speaker extraction task. You are kindly i… ☆474 · Jan 9, 2021 · Updated 5 years ago
- TTS Text Analyzer ☆32 · Jul 20, 2023 · Updated 2 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · May 13, 2024 · Updated last year
- End-to-end Text-to-Speech with Generative Adversarial Networks ☆20 · Feb 6, 2021 · Updated 5 years ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc… ☆77 · Jul 16, 2023 · Updated 2 years ago
- 2.5D visual sound ☆118 · Jul 25, 2023 · Updated 2 years ago
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. ☆242 · Feb 15, 2024 · Updated 2 years ago
- The PyTorch-based audio source separation toolkit for researchers ☆2,540 · Oct 6, 2025 · Updated 4 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo… ☆20 · Sep 1, 2023 · Updated 2 years ago
- A curated list of different papers and datasets in various areas of audio-visual processing ☆766 · Jan 30, 2024 · Updated 2 years ago
- Accepted by TMM 2022 ☆19 · Aug 18, 2022 · Updated 3 years ago
- ☆55 · Jan 13, 2023 · Updated 3 years ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec… ☆101 · Apr 10, 2025 · Updated 10 months ago
- A library of speech gadgets. ☆14 · Oct 15, 2022 · Updated 3 years ago
- Official implementation of Meta-StyleSpeech and StyleSpeech ☆252 · Feb 9, 2022 · Updated 4 years ago
- SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification ☆30 · Mar 24, 2023 · Updated 2 years ago
- speech enhancement\speech separation\sound source localization ☆1,225 · Nov 14, 2023 · Updated 2 years ago
- ☆25 · Mar 12, 2022 · Updated 3 years ago
- ☆24 · Jul 15, 2024 · Updated last year
- ☆15 · May 8, 2021 · Updated 4 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 · May 30, 2023 · Updated 2 years ago