☆67Sep 13, 2022Updated 3 years ago
Alternatives and similar repositories for audio-visual
Users who are interested in audio-visual compare it to the libraries listed below
- ☆21Feb 15, 2022Updated 4 years ago
- ☆49Nov 24, 2022Updated 3 years ago
- ☆56Aug 7, 2022Updated 3 years ago
- ☆80Sep 4, 2022Updated 3 years ago
- Look Who’s Talking: Active Speaker Detection in the Wild☆76Aug 24, 2023Updated 2 years ago
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'☆450Oct 23, 2023Updated 2 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and quickly.☆16Jun 17, 2022Updated 3 years ago
- ☆11Nov 5, 2021Updated 4 years ago
- Splits for epic-sounds dataset☆86Aug 2, 2025Updated 7 months ago
- Active Speaker Detection☆19Jun 19, 2020Updated 5 years ago
- This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one …☆40Mar 13, 2024Updated last year
- Enable RNNLM lattice rescoring with Pytorch [kaldi]☆12Jun 5, 2020Updated 5 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Feb 13, 2021Updated 5 years ago
- This is an extension of the kaldi speech recognition software that allows decoding of speech with hybrid word and phoneme graphs.…☆11Feb 4, 2020Updated 6 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)☆21Sep 27, 2017Updated 8 years ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Apr 7, 2025Updated 10 months ago
- The History of Speech Recognition to the Year 2030☆13Aug 14, 2021Updated 4 years ago
- ☆13May 9, 2022Updated 3 years ago
- ☆37Jun 28, 2021Updated 4 years ago
- ☆78Jan 5, 2024Updated 2 years ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization☆26Jan 6, 2024Updated 2 years ago
- Decoders from Kaldi using OpenFst☆34Jan 29, 2026Updated last month
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset☆59Nov 23, 2020Updated 5 years ago