cogmhear/avse_challenge

COG-MHEAR Audio-Visual Speech Enhancement Challenge

☆48

Alternatives and similar repositories for avse_challenge

Users who are interested in avse_challenge are comparing it to the repositories listed below.

cogmhear / Intelligibility-Oriented-Audio-Visual-Speech-Enhancement
Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
☆15 · Sep 6, 2024 · Updated last year
BUTSpeechFIT / TS_SUPERB
☆16 · Apr 2, 2025 · Updated last year
kaistmm / FlowAVSE
☆27 · Jul 15, 2024 · Updated 2 years ago
Audio-WestlakeU / RCT
This repo gives the code for the official implementation of RCT.
☆13 · Jun 28, 2022 · Updated 4 years ago
ga642381 / Spoken-Dialogue-Model-Survey
A survey of spoken dialogue models (SDMs) with speech input and speech output. Focus on their Intermediate Representation and Generation …
☆31 · Mar 24, 2026 · Updated 4 months ago
JuanFMontesinos / VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
☆35 · Mar 18, 2023 · Updated 3 years ago
Audio-WestlakeU / RVAE-EM
Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutiv…
☆51 · Mar 6, 2025 · Updated last year
ahmadikalkhorani / AVCrossNet
☆16 · Jul 4, 2024 · Updated 2 years ago
mispchallenge / misp2022_baseline
☆33 · Jun 26, 2023 · Updated 3 years ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
claritychallenge / clarity
Clarity Challenge toolkit - software for building Clarity Challenge systems
☆189 · Updated this week
Beilong-Tang / TSELM
Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models
☆60 · Apr 14, 2025 · Updated last year
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, Accepted to Interspeech 2024
☆16 · Nov 19, 2024 · Updated last year
ichi131 / Direction-based-BiTSE
☆15 · Sep 19, 2024 · Updated last year
Audio-WestlakeU / pytorch_lightning_template_for_beginners
A pytorch template for beginners based on pytorch_lightning
☆51 · Feb 1, 2024 · Updated 2 years ago
nguyenvulebinh / AVSRCocktail
Audio-Visual Speech Recognition
☆26 · Jul 7, 2025 · Updated last year
chimechallenge / chime6-synchronisation
Code for synchronising all CHiME-5 audio signals for use in CHiME-6
☆19 · Dec 2, 2019 · Updated 6 years ago
zexupan / reentry
☆18 · Nov 22, 2024 · Updated last year
haoxiangsnr / llm-tse
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
☆43 · Oct 13, 2023 · Updated 2 years ago
ddlBoJack / Awesome-Speech-Pretraining
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
☆212 · Jan 18, 2024 · Updated 2 years ago
haoxiangsnr / spiking-fullsubnet
Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.
☆142 · Jan 28, 2026 · Updated 6 months ago
nglehuy / semetrics
Speech Enhancement Metrics (PESQ, CSIG, CBAK, COVL)
☆78 · Jun 9, 2020 · Updated 6 years ago
TeaPoly / PLCPA-ASYM-Loss
The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss
☆15 · Sep 4, 2023 · Updated 2 years ago
desh2608 / gss
A simple package for Guided source separation (GSS)
☆134 · May 20, 2024 · Updated 2 years ago
Andong-Li-speech / EaBNet
This is the repo of the manuscript "Embedding and Beamforming: All-Neural Causal Beamformer for Multichannel Speech Enhancement", which w…
☆107 · Jun 10, 2022 · Updated 4 years ago
NARUTO-2024 / WavBench
WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models
☆38 · Feb 13, 2026 · Updated 5 months ago
zaocan666 / DyViSE
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12 · Jul 5, 2022 · Updated 4 years ago
alessandroragano / nomad
NOMAD: Non-Matching Audio Distance (ICASSP 2024)
☆30 · Jun 17, 2025 · Updated last year
JaesungHuh / ca-subtitle
Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"
☆21 · Nov 3, 2025 · Updated 8 months ago
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
isjwdu / DFADD
Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset
☆16 · Apr 7, 2025 · Updated last year
roger-tseng / av-superb
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58 · Apr 17, 2024 · Updated 2 years ago
vkothapally / Subband-Beamformer
☆33 · Nov 29, 2022 · Updated 3 years ago
audiolabs / torch-pesq
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
☆228 · Jul 14, 2023 · Updated 3 years ago
JusperLee / Swift-Net
Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation
☆26 · Jul 20, 2026 · Updated last week
inverse-ai / FINALLY-Speech-Enhancement
FINALLY: Fast and universal speech enhancement model delivering studio-quality audio for a wide range of recordings.
☆28 · Apr 1, 2026 · Updated 4 months ago
facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆250 · Jul 25, 2023 · Updated 3 years ago
stepfun-ai / StepAudio-Skills
Audio skills for Claw
☆27 · Apr 16, 2026 · Updated 3 months ago