zcxu-eric / Ego4d_TalkNet_ASD
☆21Feb 15, 2022Updated 3 years ago
Alternatives and similar repositories for Ego4d_TalkNet_ASD
Users who are interested in Ego4d_TalkNet_ASD are comparing it to the libraries listed below
- ☆67Sep 13, 2022Updated 3 years ago
- [CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'☆13Jun 16, 2024Updated last year
- Code for the Active Speakers in Context Paper (CVPR2020)☆56May 19, 2021Updated 4 years ago
- Active Speaker Detection☆19Jun 19, 2020Updated 5 years ago
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'☆444Oct 23, 2023Updated 2 years ago
- Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset☆72Jan 18, 2022Updated 4 years ago
- ☆13May 9, 2022Updated 3 years ago
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset☆59Nov 23, 2020Updated 5 years ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)☆20Mar 17, 2025Updated 10 months ago
- A simple tool to easily use Montreal Forced Aligner. Also provides alignments (TextGrid) retrieved from ESD.☆45May 25, 2023Updated 2 years ago
- Implementation of multi-level Contrastive Predictive Coding (CPC) methods☆20Jan 12, 2023Updated 3 years ago
- ☆20Dec 29, 2024Updated last year
- ☆49Nov 24, 2022Updated 3 years ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)☆54Jan 29, 2024Updated 2 years ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Sep 19, 2025Updated 4 months ago
- This repository presents FSD dataset for song deepfake detection.☆25Aug 18, 2025Updated 5 months ago
- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)☆68Oct 29, 2023Updated 2 years ago
- Chinese polyphone disambiguation for Text-to-Speech application☆42Jun 11, 2024Updated last year
- Official PyTorch implementation of the paper "Robust Training for Speaker Verification against Noisy Labels" in INTERSPEECH 2023.☆11Oct 23, 2023Updated 2 years ago
- Neural network-based forced alignment with bidirectional attention mechanism☆78Jan 17, 2025Updated last year
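- GAN Step By Step (GSBS): as the name suggests, I hope to learn GANs step by step. A GAN, or generative adversarial network, is an unsupervised algorithm that has been very popular in recent years; it can generate highly realistic photos, images, and even videos. GANs are a brand-new area of imaging; from the GAN's development since 2014 to now, in computer vision…☆11Jan 11, 2023Updated 3 years ago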
- Code repository for ‘Adaptive Differential Denoising for Respiratory Sounds Classification’☆20Dec 19, 2025Updated last month
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection☆48May 1, 2023Updated 2 years ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)☆165Mar 23, 2025Updated 10 months ago
- ☆15Jun 12, 2025Updated 8 months ago
- Resources for "Simple Speech Representation Learning from Perceptual Data".☆11Sep 18, 2023Updated 2 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- ☆11Aug 11, 2023Updated 2 years ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…☆11Nov 21, 2023Updated 2 years ago
- Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion☆40Oct 22, 2022Updated 3 years ago
- ☆11Sep 26, 2024Updated last year
- This is the accompanying repository to the paper - Automatic Estimation of Singing Voice Musical Dynamics☆15Oct 28, 2024Updated last year
- ☆11Nov 5, 2021Updated 4 years ago
- ☆10Feb 19, 2021Updated 4 years ago
- ☆12Jun 2, 2018Updated 7 years ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- Twitch (OAuth) authentication strategies for Passport.☆10Apr 18, 2024Updated last year