TaoRuijie/MFV-KSD

Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)

☆22

Alternatives and similar repositories for MFV-KSD

Users who are interested in MFV-KSD are comparing it to the repositories listed below.

xiaoxiaomiao323 / MSA
☆16 · Feb 19, 2026 · Updated 5 months ago
mavceleb / mavceleb_baseline
☆11 · Nov 5, 2025 · Updated 8 months ago
msaadsaeed / SBNet
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆13 · Aug 28, 2023 · Updated 2 years ago
ductuantruong / enskd
[ICASSP'24] Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
☆16 · Mar 20, 2024 · Updated 2 years ago
Jiang-Yidi / TS-TalkNet
INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues
☆61 · May 29, 2023 · Updated 3 years ago
jagabandhumishra / W2V-E2E-Language-Diarization
☆11 · Sep 4, 2023 · Updated 2 years ago
HaoFengyuan / EEND-IAAE
The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…
☆11 · Aug 27, 2023 · Updated 2 years ago
X-LANCE / MSDWILD
[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆66 · Jan 24, 2024 · Updated 2 years ago
desh2608 / diarizer
Clustering-based methods for overlapping diarization
☆84 · Jan 12, 2024 · Updated 2 years ago
zds-potato / multilingual-phonetic-sv
☆10 · Dec 22, 2023 · Updated 2 years ago
SSTC-Challenge / SSTC2024_baseline_system
☆12 · Jun 14, 2024 · Updated 2 years ago
JunyiPeng00 / SLT22_MultiHead-Factorized-Attentive-Pooling
An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification
☆24 · Sep 22, 2024 · Updated last year
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 · Feb 5, 2025 · Updated last year
zcxu-eric / AVA-AVD
☆51 · Nov 24, 2022 · Updated 3 years ago
ZXHY-82 / w2v-BERT-2.0_SV
☆53 · Mar 28, 2026 · Updated 4 months ago
qinxiaoyi / TimeVarying_ASV
☆12 · Oct 17, 2024 · Updated last year
wngh1187 / ExU-Net
Pytorch implementation of Extended U-Net for Speaker Verification in Noisy Environments
☆28 · Jul 24, 2023 · Updated 3 years ago
DongKeon / Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
☆368 · Mar 24, 2026 · Updated 4 months ago
kaistmm / voxceleb-disentangler
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18 · Jul 23, 2024 · Updated 2 years ago
msaadsaeed / FOP
Official implementation of FOP method as described in "Fusion and Orthogonal Projection for Improved Face-Voice Association"
☆23 · Dec 31, 2025 · Updated 6 months ago
Jiang-Yidi / FlatTrajectoryDistillation_FTD
The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)
☆18 · Mar 21, 2023 · Updated 3 years ago
dmlguq456 / NeXt_TDNN_ASV
Official repository of NeXt-TDNN for speaker verification
☆85 · Oct 10, 2024 · Updated last year
JaesungHuh / av-diarization
Audio-visual diarization pipeline used for creating VoxConverse dataset
☆22 · Jun 6, 2025 · Updated last year
pengzhendong / speaker-diarization
Offline Speaker Diarization with SenseVoice by Sherpa ONNX.
☆15 · Dec 23, 2024 · Updated last year
IDRnD / redimnet
The official pytorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"
☆205 · Jul 9, 2026 · Updated 2 weeks ago
THUsatlab / BERT-LID
Leveraging BERT to Improve Spoken Language Identification
☆17 · Nov 22, 2022 · Updated 3 years ago
mubingshen / MLC-SLM-Baseline
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…
☆51 · May 14, 2025 · Updated last year
qinxiaoyi / Simple-Attention-Module-based-Speaker-Verification-with-Iterative-Noisy-Label-Detection
☆12 · Jun 14, 2022 · Updated 4 years ago
Maokui-He / NSD-MA-MSE
A pytorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"
☆62 · Sep 19, 2024 · Updated last year
my-yy / vfal_papers
Voice Face Association Learning Paper List
☆17 · May 20, 2023 · Updated 3 years ago
SpeechClub / CDER_Metric
CDER (Conversational Diarization Error Rate) Scoring Tool
☆22 · Sep 13, 2022 · Updated 3 years ago
Purdue-M2 / AI-Synthesized-Voice-Generalization
This repository is the official implementation of our paper "Improving Generalization for AI-Synthesized Voice Detection", which has been…
☆23 · Jan 13, 2026 · Updated 6 months ago
wentaozhu / speechnas
SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification
☆30 · Mar 24, 2023 · Updated 3 years ago
TaoRuijie / AVCleanse
ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'
☆44 · Oct 31, 2022 · Updated 3 years ago
Junhua-Liao / Light-ASD
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆181 · Mar 23, 2025 · Updated last year
Tiago-Roxo / WASD
☆20 · Updated this week
Xflick / EEND_PyTorch
A PyTorch implementation of End-to-End Neural Diarization
☆110 · Jun 19, 2023 · Updated 3 years ago
clement-pages / gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
☆71 · Apr 22, 2026 · Updated 3 months ago
zyzisyz / mfa_conformer
☆160 · Jan 9, 2023 · Updated 3 years ago