kaistmm/FlowAVSE

☆27

Alternatives and similar repositories for FlowAVSE

Users who are interested in FlowAVSE are comparing it to the repositories listed below.

joanne-b-nortier / UDiffSE
☆41 · Feb 1, 2024 · Updated 2 years ago
ahmadikalkhorani / AVCrossNet
☆16 · Jul 4, 2024 · Updated 2 years ago
kaistmm / TalkNCE
Official implementation of TalkNCE (ICASSP 2024).
☆18 · Apr 30, 2025 · Updated last year
Bose / RAVEN
☆20 · Oct 6, 2025 · Updated 9 months ago
zexupan / reentry
☆18 · Nov 22, 2024 · Updated last year
cogmhear / avse_challenge
COG-MHEAR Audio-Visual Speech Enhancement Challenge
☆48 · Feb 17, 2026 · Updated 5 months ago
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 · May 13, 2024 · Updated 2 years ago
Overcautious / ADENet
Accepted by TMM 2022
☆19 · Aug 18, 2022 · Updated 3 years ago
kaistmm / VoiceDiT
[ICASSP 2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis
☆52 · Apr 9, 2025 · Updated last year
IiuZiKai / Evo_TSE
☆17 · Apr 9, 2026 · Updated 3 months ago
robflynnyh / long-context-asr
Code for the paper: How Much Context Does My Attention-Based ASR System Need?
☆11 · Jul 3, 2026 · Updated 3 weeks ago
yyliu01 / AuralSAM2
[CVPR'26, Findings] AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting
☆15 · May 18, 2026 · Updated 2 months ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
kaistmm / V2SFlow
[ICASSP 2025] V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow
☆21 · Jun 3, 2025 · Updated last year
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV 2021)
☆22 · Apr 11, 2022 · Updated 4 years ago
nguyenvulebinh / AVSRCocktail
Audio-Visual Speech Recognition
☆26 · Jul 7, 2025 · Updated last year
plnguyen2908 / UniTalk-ASD-code
[Interspeech 2026] Revisiting Active Speaker Detection: An In-the-Wild Benchmark for Generalization and Robustness
☆22 · Jun 25, 2026 · Updated last month
danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆250 · Jul 25, 2023 · Updated 3 years ago
PoKoHA / Speech_Enhancement-DCCRN
DCCRN: Deep Complex Convolution Recurrent Network
☆14 · Nov 26, 2021 · Updated 4 years ago
kaistmm / voxceleb-disentangler
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18 · Jul 23, 2024 · Updated 2 years ago
kaistmm / fregrad
[ICASSP 2024] Official code for FreGrad
☆35 · May 13, 2024 · Updated 2 years ago
gladia-research-group / latent-autoregressive-source-separation
☆18 · Apr 28, 2023 · Updated 3 years ago
ahaliassos / raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82 · Feb 27, 2025 · Updated last year
JeongHun0716 / Personalized-Lip-Reading
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)
☆24 · Jun 29, 2026 · Updated last month
adobe-research / openflam
OpenFLAM: Framewise Language Audio Model
☆110 · Jun 4, 2026 · Updated last month
rikishimizu / MeanFlow-TSE
☆26 · Jun 10, 2026 · Updated last month
arxrean / LipRead-seq2seq
An unofficial (PyTorch) implementation for the paper Deep Lip Reading: A comparison of models and an online application.
☆10 · May 13, 2020 · Updated 6 years ago
eloimoliner / CQT_pytorch
PyTorch implementation of the invertible CQT based on Non-stationary Gabor filters
☆36 · Jul 7, 2026 · Updated 3 weeks ago
TeeJayBaker / PolyDDSP
Polyphonic generalisation of DDSP
☆22 · Apr 30, 2024 · Updated 2 years ago
JuanFMontesinos / VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
☆35 · Mar 18, 2023 · Updated 3 years ago
mutiann / speech_rankings
A CSRankings-like index for speech researchers
☆35 · Oct 16, 2024 · Updated last year
vivjay30 / pnf-sampling
☆22 · Jun 8, 2021 · Updated 5 years ago
robd003 / sph2pipe
Provides SPHERE-formatted output as well as RIFF, AU, AIFF and raw
☆14 · Dec 18, 2021 · Updated 4 years ago
Louis0324 / DDSP-Articulatory-Vocoder
☆29 · Sep 5, 2024 · Updated last year
BASHLab / OWL
☆15 · May 25, 2026 · Updated 2 months ago
Blinorot / utmos-pytorch
Unofficial fairseq-free PyTorch implementation of UTMOS (v1, 2022), matching the original system.
☆35 · Jun 6, 2026 · Updated last month
zexupan / USEV
☆14 · Jul 1, 2024 · Updated 2 years ago
asteroid-team / pytorch-pit
Permutation invariant training in PyTorch
☆13 · Oct 2, 2020 · Updated 5 years ago