dr-pato/audio_visual_speech_enhancement

Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments

☆112

Alternatives and similar repositories for audio_visual_speech_enhancement

Users who are interested in audio_visual_speech_enhancement are comparing it to the repositories listed below.

danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
kagaminccino / LAVSE
Python codes for Lite Audio-Visual Speech Enhancement.
☆95 · May 3, 2024 · Updated 2 years ago
craigmacartney / Wave-U-Net-For-Speech-Enhancement
Improved speech enhancement with the Wave-U-Net, a deep convolutional neural network architecture for audio source separation, implemente…
☆224 · Mar 24, 2023 · Updated 3 years ago
aishoot / LSTM_PIT_Speech_Separation
Two-talker Speech Separation with LSTM/BLSTM by Permutation Invariant Training method.
☆311 · Jan 6, 2022 · Updated 4 years ago
ms-dot-k / LRW_ID
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10 · Oct 12, 2023 · Updated 2 years ago
bill9800 / speech_separation
Include some core functions and model to handle speech separation
☆156 · Jun 24, 2021 · Updated 5 years ago
speechLabBcCuny / onssen
An open-source speech separation and enhancement library
☆214 · May 13, 2020 · Updated 6 years ago
aispeech-lab / advr-avss
Pytorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.
☆18 · Jul 11, 2022 · Updated 4 years ago
lifelongeek / AAS_enhancement
This repository contains the code and supplementary result for the paper "Unpaired Speech Enhancement by Acoustic and Adversarial Supervi…
☆28 · Oct 10, 2019 · Updated 6 years ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
JuanFMontesinos / VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
☆35 · Mar 18, 2023 · Updated 3 years ago
facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆250 · Jul 25, 2023 · Updated 3 years ago
smeetrs / deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
☆244 · Feb 15, 2024 · Updated 2 years ago
JusperLee / Looking-to-Listen-at-the-Cocktail-Party
Executable code based on Google articles
☆166 · Dec 8, 2022 · Updated 3 years ago
mpc001 / end-to-end-lipreading
Pytorch code for End-to-End Audiovisual Speech Recognition
☆183 · Nov 18, 2022 · Updated 3 years ago
JusperLee / Dual-Path-RNN-Pytorch
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation implemented by Pytorch
☆468 · Feb 14, 2023 · Updated 3 years ago
wangkenpu / Conv-TasNet-PyTorch
A PyTorch implementation of Conv-TasNet
☆46 · Nov 25, 2019 · Updated 6 years ago
georgesterpu / avsr-tf1
Audio-Visual Speech Recognition using Sequence to Sequence Models
☆84 · Jul 10, 2020 · Updated 6 years ago
funcwj / conv-tasnet
A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" (see recipes in aps framework https:/…
☆219 · Jul 6, 2023 · Updated 3 years ago
Andong-Li-speech / DARCN
The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"
☆80 · Dec 8, 2022 · Updated 3 years ago
nii-yamagishilab / Intelligibility-MetricGAN
Implementation for paper "iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric L…
☆56 · Jul 6, 2023 · Updated 3 years ago
yakovmon / Real-Time-Audio-Visual-Speech-Enhancement
☆13 · May 27, 2019 · Updated 7 years ago
zexupan / reentry
☆18 · Nov 22, 2024 · Updated last year
andi611 / ZeroSpeech-TTS-without-T
A Pytorch implementation for the ZeroSpeech 2019 challenge.
☆112 · Nov 12, 2019 · Updated 6 years ago
funcwj / deep-clustering
deep clustering method for single-channel speech separation
☆110 · Jun 21, 2022 · Updated 4 years ago
mayurnewase / looking-to-listen-at-cocktail-party
Looking to listen at cocktail party
☆36 · Mar 24, 2023 · Updated 3 years ago
uark-cviu / Right2Talk
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20 · Aug 2, 2021 · Updated 4 years ago
dr-pato / SSGD
Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"
☆15 · Dec 22, 2022 · Updated 3 years ago
itsyoavshalev / End-to-End-Lip-Synchronization-with-a-Temporal-AutoEncoder
☆22 · Mar 31, 2022 · Updated 4 years ago
maum-ai / voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
☆1,214 · Jul 25, 2024 · Updated 2 years ago
francoisgermain / SpeechDenoisingWithDeepFeatureLosses
Speech Denoising with Deep Feature Losses
☆188 · Jun 8, 2020 · Updated 6 years ago
JuanFMontesinos / Acappella-YNet
Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21
☆18 · May 14, 2022 · Updated 4 years ago
funcwj / uPIT-for-speech-separation
Speech separation with utterance-level PIT experiments
☆106 · Jul 12, 2018 · Updated 8 years ago
haoxiangsnr / IRM-based-Speech-Enhancement-using-LSTM
Ideal Ratio Mask (IRM) Estimation based Speech Enhancement using LSTM
☆122 · Nov 20, 2019 · Updated 6 years ago
jinhan / tacotron2-vae
Implementation of "Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis"
☆170 · Jul 6, 2023 · Updated 3 years ago
haoheliu / Subband-Music-Separation
Pytorch: Channel-wise subband (CWS) input for better voice and accompaniment separation
☆102 · Nov 12, 2021 · Updated 4 years ago
ljuvela / GlottDNN
GlottDNN vocoder and tools for training DNN excitation models
☆34 · Feb 27, 2021 · Updated 5 years ago
haoxiangsnr / A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
A minimum unofficial implementation of the "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorc…
☆350 · Sep 5, 2020 · Updated 5 years ago
JusperLee / LRS3-For-Speech-Separation
Multi-modal speech separation task data generation script on LRS3 data set.
☆88 · Feb 2, 2024 · Updated 2 years ago