JuanFMontesinos/VoViT

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/JuanFMontesinos/VoViT)

JuanFMontesinos / VoViT

VoViT: Low Latency Graph-based Audio-Visual VoiceSeparation Transformer

☆35

Alternatives and similar repositories for VoViT

Users that are interested in VoViT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

danmic / av-se
View on GitHub
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222Apr 16, 2023Updated 3 years ago
zexupan / reentry
View on GitHub
☆18Nov 22, 2024Updated last year
cogmhear / Intelligibility-Oriented-Audio-Visual-Speech-Enhancement
View on GitHub
Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
☆15Sep 6, 2024Updated last year
JusperLee / Swift-Net
View on GitHub
Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation
☆26Jul 20, 2026Updated last week
cogmhear / avse_challenge
View on GitHub
COG-MHEAR Audio-Visual Speech Enhancement Challenge
☆48Feb 17, 2026Updated 5 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
kagaminccino / LAVSE
View on GitHub
Python codes for Lite Audio-Visual Speech Enhancement.
☆95May 3, 2024Updated 2 years ago
zexupan / MuSE
View on GitHub
☆42Nov 22, 2024Updated last year
facebookresearch / VisualVoice
View on GitHub
Audio-Visual Speech Separation with Cross-Modal Consistency
☆250Jul 25, 2023Updated 3 years ago
zengchang233 / CrossSinger
View on GitHub
The source code for the paper CrossSinger (asru2023)
☆18Oct 12, 2023Updated 2 years ago
dr-pato / audio_visual_speech_enhancement
View on GitHub
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
☆112Mar 19, 2024Updated 2 years ago
JuanFMontesinos / Solos
View on GitHub
Solos: A Dataset for Audio-Visual Music Analysis
☆24Feb 17, 2023Updated 3 years ago
RanaCM / DSU-AVO
View on GitHub
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12May 13, 2024Updated 2 years ago
desh2608 / css
View on GitHub
PyTorch implementation of Continuous Speech Separation
☆12Oct 5, 2022Updated 3 years ago
JusperLee / TFACM
View on GitHub
☆24Jul 16, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
saurjya / EnsembleSep
View on GitHub
This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.
☆12Nov 7, 2024Updated last year
NaoyukiKanda / LibriSpeechMix
View on GitHub
☆38Mar 30, 2021Updated 5 years ago
spkgyk / RTFS-Net
View on GitHub
Official code release for "RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation", accepted ICLR 2024
☆51Oct 14, 2025Updated 9 months ago
ms-dot-k / Visual-Audio-Memory
View on GitHub
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆22Apr 11, 2022Updated 4 years ago
kuielab / voice_datasets
View on GitHub
🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).
☆20Apr 1, 2021Updated 5 years ago
Sreyan88 / LipGER
View on GitHub
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19Jul 16, 2024Updated 2 years ago
morganmcg1 / wandb_spectrogram
View on GitHub
☆15Sep 24, 2022Updated 3 years ago
ahmadikalkhorani / AVCrossNet
View on GitHub
☆16Jul 4, 2024Updated 2 years ago
GiantSteps / drum-pattern-datasets
View on GitHub
This repository holds datasets of polyphonic drum patterns used in the creation of Electronic Dance Music.
☆16Dec 19, 2016Updated 9 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
MTG / Podcastmix
View on GitHub
PodcastMix A dataset for separating music and speech in podcasts.
☆44Aug 20, 2024Updated last year
asteroid-team / pytorch_stoi
View on GitHub
STOI loss functions in PyTorch (mirror of https://github.com/mpariente/pytorch_stoi)
☆15Aug 6, 2020Updated 5 years ago
RetroCirce / Choral_Music_Separation
View on GitHub
Chorale Music Separation Dataset and Model Framework
☆41Dec 5, 2022Updated 3 years ago
zaocan666 / DyViSE
View on GitHub
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12Jul 5, 2022Updated 4 years ago
W-Wu / ERC-SLT22
View on GitHub
Code for "Distribution-based Emotion Recognition in Conversation"
☆18Feb 6, 2023Updated 3 years ago
DanielMengLiu / AudioVisualLip
View on GitHub
☆25Feb 20, 2024Updated 2 years ago
georgetz15 / mss-thesis
View on GitHub
Pytorch implementation of MDensenet and sparse NMF. Made for my undergraduate thesis "Music Source Separation with Supervised Learning Me…
☆11Jan 31, 2021Updated 5 years ago
lin9x / AV-Sepformer
View on GitHub
☆65Jun 28, 2023Updated 3 years ago
AlanBaade / MAE-AST-Public
View on GitHub
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93Jun 9, 2022Updated 4 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
fschmid56 / EfficientAT_HEAR
View on GitHub
Evaluate EfficientAT models on the Holistic Evaluation of Audio Representations Benchmark.
☆34Jun 23, 2023Updated 3 years ago
uark-cviu / Right2Talk
View on GitHub
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20Aug 2, 2021Updated 4 years ago
zexupan / avse_hybrid_loss
View on GitHub
☆16Jun 15, 2022Updated 4 years ago
shleee47 / Sound-Source-Localization
View on GitHub
Sound Source Localization for AI Grand Challenge 2021
☆21Feb 7, 2022Updated 4 years ago
maxreciprocate / offline
View on GitHub
Offline RL experiments
☆15Oct 1, 2022Updated 3 years ago
rgcda / Musisep
View on GitHub
Source Separation on Musical Instrument Sounds
☆39Jan 4, 2022Updated 4 years ago
JusperLee / CTCNet
View on GitHub
An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
☆82Apr 28, 2024Updated 2 years ago