JaesungHuh / VoxSRC2022View external linksLinks
VoxSRC2022 workshop development kit
☆19Jul 21, 2022Updated 3 years ago
Alternatives and similar repositories for VoxSRC2022
Users that are interested in VoxSRC2022 are comparing it to the libraries listed below
Sorting:
- Evaluation script for VoxMovies dataset in PyTorch☆23Jan 12, 2024Updated 2 years ago
- Augmentation adversarial training for self-supervised speaker recognition☆78Aug 15, 2021Updated 4 years ago
- Look Who’s Talking: Active Speaker Detection in the Wild☆76Aug 24, 2023Updated 2 years ago
- Onset-and-Offset-Aware Sound Event Detection☆20Feb 10, 2025Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 7 months ago
- silero-vad pytorch implement☆34Nov 23, 2024Updated last year
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Apr 7, 2025Updated 10 months ago
- Highlight code in python multiline strings☆12Oct 10, 2023Updated 2 years ago
- ☆53Oct 17, 2023Updated 2 years ago
- Forced alignment decoder for Whisper.☆14Mar 13, 2024Updated last year
- Anonymous ICLR Submission☆14Sep 25, 2019Updated 6 years ago
- Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns☆17Nov 15, 2022Updated 3 years ago
- A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024☆20Jul 27, 2024Updated last year
- ☆17Jun 30, 2020Updated 5 years ago
- ☆14Jul 11, 2022Updated 3 years ago
- [ICASSP'24] Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification☆16Mar 20, 2024Updated last year
- ☆19Apr 18, 2024Updated last year
- Crowdsourced and Automatic Speech Prominence Estimation☆24Apr 12, 2024Updated last year
- ☆15Jul 4, 2024Updated last year
- Layer-wise analysis of self-supervised pre-trained speech representations☆126Oct 18, 2024Updated last year
- Estimating the Age, Height, and Gender of a speaker with their speech signal.☆14Sep 19, 2022Updated 3 years ago
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models☆24Jun 16, 2025Updated 8 months ago
- Speech enhancement in noisy and reverberant environments using deep neural networks☆22Oct 10, 2025Updated 4 months ago
- Audio-visual diarization pipeline used for creating VoxConverse dataset☆21Jun 6, 2025Updated 8 months ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2021☆18Jul 21, 2021Updated 4 years ago
- Simple diarization model☆53Jun 13, 2025Updated 8 months ago
- Open Source Speech/Text Data on AI☆19Sep 13, 2022Updated 3 years ago
- CDER (Conversational Diarization Error Rate) Scoring Tool☆22Sep 13, 2022Updated 3 years ago
- magicspeech competition recipe☆18Jun 29, 2020Updated 5 years ago
- ☆16Jun 13, 2022Updated 3 years ago
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages☆18May 23, 2024Updated last year
- ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'☆44Oct 31, 2022Updated 3 years ago
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil…☆40Jun 17, 2025Updated 7 months ago
- Acoustic echo cancelation(AEC) is a main algorithm in the pipe line of acoustic devices with KWS or ASR. FNLMS is used.☆19Apr 22, 2019Updated 6 years ago
- ☆17Aug 27, 2025Updated 5 months ago
- Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion☆25Mar 16, 2023Updated 2 years ago
- ☆49Nov 24, 2022Updated 3 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020☆43Jul 17, 2020Updated 5 years ago