mavceleb/mavceleb_baseline

☆11

Alternatives and similar repositories for mavceleb_baseline

Users who are interested in mavceleb_baseline are comparing it to the libraries listed below.

msaadsaeed / FOP
Official implementation of FOP method as described in "Fusion and Orthogonal Projection for Improved Face-Voice Association"
☆23 · Dec 31, 2025 · Updated 6 months ago
TaoRuijie / MFV-KSD
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)
☆22 · Jul 25, 2024 · Updated 2 years ago
msaadsaeed / SBNet
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆13 · Aug 28, 2023 · Updated 2 years ago
MiukkaZh / MGT
Learning Domain-Invariant Transformation for Speaker Verification.
☆11 · Jun 13, 2023 · Updated 3 years ago
wngh1187 / ExU-Net
PyTorch implementation of Extended U-Net for Speaker Verification in Noisy Environments
☆28 · Jul 24, 2023 · Updated 3 years ago
xiaoxiaomiao323 / MSA
☆16 · Feb 19, 2026 · Updated 5 months ago
qinxiaoyi / TimeVarying_ASV
☆12 · Oct 17, 2024 · Updated last year
TaoRuijie / AVCleanse
ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'
☆44 · Oct 31, 2022 · Updated 3 years ago
ductuantruong / enskd
[ICASSP'24] Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
☆16 · Mar 20, 2024 · Updated 2 years ago
Jiang-Yidi / TS-TalkNet
INTERSPEECH 2023: Target Active Speaker Detection with Audio-visual Cues
☆61 · May 29, 2023 · Updated 3 years ago
ybayle / ISM2017
Reproducible research code for the experiments presented in our article "Kara1k: a karaoke dataset for cover song identification and sing…
☆10 · Jan 9, 2018 · Updated 8 years ago
DanielMengLiu / DeepLip
Deep-learning-based audio-visual lip biometrics
☆15 · May 9, 2023 · Updated 3 years ago
my-yy / vfal_papers
Voice Face Association Learning Paper List
☆17 · May 20, 2023 · Updated 3 years ago
burhanahmed1 / TaskSphere
Integrated .NET-based desktop framework for dynamic task lifecycle management, featuring relational database connectivity, status trackin…
☆12 · May 7, 2025 · Updated last year
Tiago-Roxo / WASD
☆20 · Updated this week
iiscleap / DIHARD-2019-baseline
☆16 · Mar 7, 2019 · Updated 7 years ago
ahmedembeddedxx / AskFAST
AskFAST is a chatbot designed to handle admission-related queries for FAST. It’s your go-to AI assistant for all things admission at FAST…
☆11 · Aug 5, 2024 · Updated last year
deepaudio / deepaudio-speaker
Neural-network-based speaker embedder
☆24 · Jan 7, 2023 · Updated 3 years ago
ASD0x41 / Assembly-Programming-Package
Tools in Package: Notepad++, DOSBox, NASM & AFD
☆16 · Jan 28, 2025 · Updated last year
liu12366262626 / AlignVSR
Visual Speech Recognition
☆21 · Dec 24, 2024 · Updated last year
cvqluu / MTL-Speaker-Embeddings
Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present…
☆26 · Oct 5, 2022 · Updated 3 years ago
TaoRuijie / SEANet
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
☆32 · Feb 28, 2025 · Updated last year
ASD0x41 / xide
An online x86 assembly IDE, containing the Netwide Assembler (NASM), the Advanced Fullscreen Debugger (AFD) and em-dosbox (a WASM port of…
☆23 · Jan 27, 2025 · Updated last year
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, Accepted to Interspeech 2024
☆16 · Nov 19, 2024 · Updated last year
liyunlongaaa / AD-TUNING
AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…
☆11 · Feb 23, 2024 · Updated 2 years ago
SSTC-Challenge / SSTC2024_baseline_system
☆12 · Jun 14, 2024 · Updated 2 years ago
backspacetg / distilXLSR
Models and code for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
dmlguq456 / NeXt_TDNN_ASV
Official repository of NeXt-TDNN for speaker verification
☆85 · Oct 10, 2024 · Updated last year
zds-potato / multilingual-phonetic-sv
☆10 · Dec 22, 2023 · Updated 2 years ago
noiseux1523 / NIST-SRE-2019
Score Normalization for NIST 2019 Speaker Recognition Evaluation
☆10 · Nov 8, 2019 · Updated 6 years ago
kjanjua26 / Git-Loss-For-Deep-Face-Recognition
This repository contains code for my paper "Git Loss for Deep Face Recognition".
☆35 · Feb 7, 2021 · Updated 5 years ago
BornInWater / Overlap-Detection
Overlapped Speech detection in Multi-party Conversations
☆22 · Feb 20, 2018 · Updated 8 years ago
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 · Feb 5, 2025 · Updated last year
X-LANCE / MSDWILD
[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆66 · Jan 24, 2024 · Updated 2 years ago
nikvaessen / disjoint-mtl
Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf
☆12 · Dec 2, 2024 · Updated last year
Liu-Tianchi / Nes2Net
☆111 · Apr 4, 2026 · Updated 3 months ago
wngh1187 / RawNeXt
PyTorch implementation of RawNeXt: Speaker verification system for variable-duration utterance with deep layer aggregation and dynamic sc…
☆25 · Jun 22, 2022 · Updated 4 years ago
IDRnD / redimnet
The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"
☆205 · Jul 9, 2026 · Updated 2 weeks ago
plnguyen2908 / UniTalk-ASD-code
[Interspeech 2026] Revisiting Active Speaker Detection: An In-the-Wild Benchmark for Generalization and Robustness
☆22 · Jun 25, 2026 · Updated last month